gRPC and protobuf: roles of a server and a client

I'm new to gRPC and protobuf, and I'm trying to understand whether gRPC can fit my needs. Basically I have a piece of software which can invoke a script (bash or Python) at certain stages and pass the script some parameters (for example, transaction status, some values, etc.), so I'd like to pass these parameters on over gRPC, i.e. the gRPC communication has to be initiated by my script.
I know there is a gRPC Python library, so I'd like to take advantage of it in my script. However, it isn't quite clear to me whether my script has to act as a gRPC client or a server. The examples I have seen are quite simple request/reply exchanges, where requests are made by a client and the server replies; this is not exactly what I have in mind.

Your question is vague, which makes it difficult to provide guidance.
Stack Overflow prefers concrete developer (coding) questions, and open-ended guidance tends to be discouraged.
A couple of things:
Essentially, gRPC is a mechanism by which something calls (invokes) functions|methods on something else. Usually (but not necessarily) the something else is accessed via a network. The basic idea is that you want to be able to call some procedure (function|method), e.g. something of the form add(a,b), but the place where add is actually implemented|performed isn't your local machine; it is remote. Ergo, Remote Procedure Call (RPC), with the "g" (perhaps originally) standing for "Google".
Since gRPC is just (remote) procedure calling, there is often a convention that the caller is the client and the thing being called is the server. But these concepts are fluid: a client can be a server, and a server can be a client too, depending on who initiates the call. In your case, since the script initiates the communication, it would normally act as the gRPC client (see the sketch after these points).
gRPC is often (but not necessarily) used instead of REST, GraphQL and (many) others. It's important that you be aware of the "price" you pay for gRPC's benefits: you must define a schema for your messages; messages are sent (over a network) in an efficient binary format (i.e. not human-readable); gRPC uses HTTP/2; and you must have a gRPC implementation for your language (Python is supported, as are many other languages).
gRPC implementations vary, but the major implementations support synchronous and asynchronous calls, request-response, and client, server and bidirectional streaming.
In many cases, REST|HTTP is easier to use because it sends human-readable "messages", there are many tools (e.g. curl) available, and everyone's been using it forever.
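Since your script initiates the call, the natural shape is for the script to act as the gRPC client, with whatever receives the parameters running the gRPC server. A minimal sketch of the client side, assuming a hypothetical Notifier service with a Report RPC and Python stubs generated from its .proto with grpc_tools.protoc (all names here are made up):

```python
# Hypothetical example: assumes a .proto defining roughly
#   service Notifier { rpc Report (StatusUpdate) returns (Ack); }
# with stubs generated into notifier_pb2 / notifier_pb2_grpc.
import grpc
import notifier_pb2
import notifier_pb2_grpc

def report_status(status: str, value: int):
    # The script is the gRPC *client*: it opens the channel and initiates the call.
    with grpc.insecure_channel("localhost:50051") as channel:
        stub = notifier_pb2_grpc.NotifierStub(channel)
        ack = stub.Report(notifier_pb2.StatusUpdate(status=status, value=value))
        return ack

if __name__ == "__main__":
    report_status("COMMITTED", 42)
```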
I encourage you to read the content on the framework's site.

Related

Simplest C++ library that supports distributed messaging - Observer Pattern

I need to do something relatively simple, and I don't really want to install a MOM like RabbitMQ etc.
There are several programs that "register" with a central
"service" server through TCP. The only function of the server is to
call back all the registered clients when they all in turn say
"DONE". So it is a kind of "join" (edit: Barrier) for distributed client processes.
When all clients say "DONE" (they can be done at totally different times), the central server messages
them all saying "ALL-COMPLETE". The clients "block" until asynchronously called back.
So this is a kind of distributed asynchronous Observer Pattern. The server has to keep track of where the clients are somehow. It is OK for the client to pass its IP address to the server, etc. It is constructible with things like Boost.Signals, Boost.Asio, Boost.Dataflow etc., but I don't want to reinvent the wheel if something simple already exists. I got very close with ZeroMQ, but none of their patterns support this use-case very well, AFAIK.
Is there a very simple system that does this? Notice that the server can be written in any language. I just need C++ bindings for the clients.
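(For concreteness, the DONE/ALL-COMPLETE exchange described above is small enough to sketch over plain TCP. The sketch below is in Python and purely illustrative -- it treats connecting as registering, assumes the client count is known up front, and is not the actor-framework solution given in the answer that follows.)

```python
# Illustrative barrier server: each client connects (registers), sends "DONE",
# and blocks until the server answers "ALL-COMPLETE". Host/port and the fixed
# client count are assumptions made for the sketch.
import socket
import threading

EXPECTED_CLIENTS = 3
HOST, PORT = "0.0.0.0", 9000

def serve():
    done_clients = []
    lock = threading.Lock()
    barrier_reached = threading.Event()

    def handle(conn):
        if conn.recv(64).strip() == b"DONE":
            with lock:
                done_clients.append(conn)
                if len(done_clients) == EXPECTED_CLIENTS:
                    barrier_reached.set()
            barrier_reached.wait()            # block until everyone has reported
            conn.sendall(b"ALL-COMPLETE\n")   # call back the registered client
        conn.close()

    with socket.create_server((HOST, PORT)) as srv:
        while True:
            conn, _ = srv.accept()
            threading.Thread(target=handle, args=(conn,), daemon=True).start()

if __name__ == "__main__":
    serve()
```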
After much searching, I used this library
https://github.com/actor-framework
It turns out that doing this with this framework is relatively straightforward. The only real "impediment" to using it is that the library seems to have gone through an API transition recently and the documentation .pdf file has not completely caught up with the source. No biggie, since the example programs and the source (.hpp) files get you over this hump. However, they need to bring the docs in sync with the source. In addition, IMO they need to provide more interesting examples of how to use C++ actors for extreme performance. For my case it is not needed, but the shared-nothing idea behind actors is one of the reasons people use them in this kind of use-case instead of shared-memory communication between threads.
Also, the syntax that the library enforces (get used to lambdas!) can be a bit of a mind-twister at first if one is not used to state-of-the-art C++11 programs. After that, the only other caveat was the (trivial) task of remembering all the clients that had registered with the server.
STRONGLY RECOMMENDED.

Management layer above Thrift

Thrift sounds awesome, but I can't find some basic stuff I'm used to from RPC frameworks (such as HttpServlet). Examples of the things I can't find: session management, filtering, upload/download progress.
I understand that the missing stuff might be a management layer on top of Thrift. If so, any example of such a layer? Perhaps AOP (Aspect Oriented)?
I can't imagine such a layer that compiles to all languages, and that's what I'm missing. Taking session management as an example, there might be several clients that all need to do some authentication and pass the session_id along with each RPC. I would expect a similar API in every language for doing so.
Does anyone know of a management layer for Thrift?
So Thrift itself is not going to help you out a lot here.
I have had similar desires, and have a few suggestions:
1. Put your management objects into the IDL
Simply add an API token or a common transfer-data struct as a parameter to all of your service methods. Set it as parameter id 15 so that it will always be the last parameter, even if you add others in the middle.
As the first step in your handler you can validate/store/do whatever with the extra data.
This has the advantage that it is valid in any platform that thrift supports.
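A sketch of what the server side of option 1 might look like in Python; the names here (MyServiceHandler, AuthData, do_work) are made up, and in the IDL each method would take a trailing parameter along the lines of `15: AuthData auth`:

```python
# Sketch only: a handler that validates the trailing auth parameter before
# doing any real work. The service, struct and exception names are hypothetical.
class AuthError(Exception):
    pass

class MyServiceHandler:
    def __init__(self, session_store):
        self.sessions = session_store   # e.g. a set/dict of valid session ids

    def _check(self, auth):
        # first step in every handler method: validate/store the extra data
        if auth is None or auth.session_id not in self.sessions:
            raise AuthError("invalid or missing session")

    def do_work(self, request, auth):
        self._check(auth)
        # ... actual business logic ...
        return "ok"
```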
2. Use Thrift over HTTP
If you use HTTP as your transport, you can include whatever data you want as HTTP headers, and the Thrift content as the body.
This will often require a custom http client for every platform you use to inject the data, and a custom handler on the server to use the data, but neither of those are prohibitively difficult.
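For example, with the Python Thrift library the client side of option 2 could look roughly like this; the endpoint URL, header name and generated MyService code are assumptions:

```python
# Sketch of option 2: management data rides in HTTP headers, the Thrift call is the body.
from thrift.transport import THttpClient
from thrift.protocol import TBinaryProtocol
# from my_idl import MyService   # hypothetical generated code

transport = THttpClient.THttpClient("http://localhost:9090/thrift")
transport.setCustomHeaders({"X-Session-Id": "abc123"})   # extra data as headers
protocol = TBinaryProtocol.TBinaryProtocol(transport)
# client = MyService.Client(protocol)
# transport.open()
# client.do_work(request)
```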
3. Hack the protocol
It is possible to create your own custom protocol that wraps another protocol and injects custom data. Take a look at how the multiplexed protocol works in the Thrift library for most languages (the C# implementation, for example). It sends the method name across the wire as service:method. The multiplexed processor unwraps this encoding and passes it on to the appropriate processor.
I have used a similar method to encode arbitrary key/value pairs (like http headers) inside the method name.
The downside to this is that you need to write a more complicated extension for each platform you will be using, though only once. How this works varies a bit from language to language, but it is generally simple enough once you have figured it out the first time.
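A rough Python sketch of option 3, in the spirit of TMultiplexedProtocol: a wrapper that smuggles key/value pairs into the method name. The separator and encoding here are invented, and the server side would need a matching processor to strip them back out:

```python
# Sketch only: wraps a real Thrift protocol object and appends "headers" to the
# method name on every outgoing call; everything else is delegated untouched.
class HeaderInjectingProtocol:
    def __init__(self, wrapped, headers):
        self._wrapped = wrapped
        self._headers = headers

    def writeMessageBegin(self, name, msg_type, seqid):
        encoded = ";".join("%s=%s" % kv for kv in self._headers.items())
        self._wrapped.writeMessageBegin(name + "#" + encoded, msg_type, seqid)

    def __getattr__(self, attr):
        # delegate all other protocol methods to the real protocol
        return getattr(self._wrapped, attr)
```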
These are just a few ideas I have had, and I am sure there are others. The nice thing about thrift is how the individual components are decoupled from each other. If you have special needs you can swap any of them out as you need to to add specific functionality.

Do Rack-based web servers implement the FastCGI protocol?

I've read that CGI/FastCGI is a protocol for interfacing external applications to web servers.
So the web server (like Apache or Nginx) sends environment information and the page request itself to a FastCGI process over a socket, and responses are returned by FastCGI to the web server over the same connection; the web server subsequently delivers that response to the end user.
Now I'm confused between this and Rack, which is used by almost all Ruby web frameworks and libraries. It provides an interface for developing web applications in Ruby by wrapping HTTP requests and responses.
So, do Rack-based web servers like Unicorn, Thin, Passenger or Puma represent the same FastCGI approach? Can I say that Unicorn is a Ruby implementation of FastCGI?
As you say:
FastCGI is a protocol
Rack is an API
So these are actually two quite different things, though they could
be used together.
FastCGI specifies how two different processes should talk to each other
FastCGI, as a protocol, specifies how two different processes (nominally a web server and an application server or "FastCGI server") should talk to each other over a network connection. The specification defines records of data in a particular format that are sent and received by the two processes.
Exactly what the programs that send and receive these messages look like is not specified, and could be anything. On one side you might have a C program that assembles data in memory and then makes system calls to have the OS send the data, and on the other side you might have a Ruby program that opens a socket, reads in data into Arrays, and then parses those data, and builds a new object encapsulating the request.
Rack specifies what Ruby objects and methods must be made available to higher-level software
On the other hand, Rack, being a Ruby API specification, specifies precisely what Ruby objects and methods must be made available to higher-level software implementing some sort of web application, and how those objects and methods must behave, from the point of view of the application. (Don't be confused by the use of the word "protocol" in the document linked above. Here it's used not in the sense of data formats sent over a communications link, but in the object-oriented programming sense of the conceptual "messages" exchanged between objects to express program behavior, though this is actually implemented, at various levels and times, as function calls.)
Being an API specification, the user of the Rack API ought at least to behave as if it has no idea what's going on underneath the hood when it calls methods on the various objects an implementation of Rack presents. (Frequently it will have no idea.) It could be the case that the library actually has set up communication with a separate process acting as a web server, via FastCGI or some other protocol, and reads messages from the other process and sends messages back to it, based on what the application using the API implementation does. But on the other hand, you could equally (at least in theory) drop in a completely different implementation of the API that itself has Ruby code to run a web server, and the very same process that ran Ruby code for the web application would be running additional Ruby code to talk the HTTP protocol directly with a client web browser or whatever.
You can't say that Unicorn (or any other implementation of the Rack API) is a "Ruby implementation of FastCGI"
The question does not apply in the way that you asked it, because the whole point of the Rack API specification is that you explicitly avoid thinking about the actual implementation of the services provided through that API. It could well be that some implementations are using FastCGI, but your application should work equally well with one that's not, and you really don't want to care about what's going on underneath the hood.

Faye vs. Socket.IO (and Juggernaut)

Socket.IO seems to be the most popular and active WebSocket emulation library. Juggernaut uses it to create a complete pub/sub system.
Faye is also popular and active, and has its own JavaScript library, making its complete functionality comparable to Juggernaut. Juggernaut uses Node for its server, and Faye can use either Node or Rack. Juggernaut uses Redis for persistence (correction: it uses Redis for pub/sub), and Faye only keeps state in memory.
Is everything above accurate?
Faye says it implements Bayeux -- I think Juggernaut does not do this -- is that because Juggernaut is lower level (i.e., could I implement Bayeux using Juggernaut)?
Could Faye switch to using the Socket.IO browser javascript library if it wanted to? Or do their javascript libraries do fundamentally different things?
Are there any other architectural/design/philosophy differences between the projects?
Disclosure: I am the author of Faye.
Regarding Faye, everything you've said is true.
Faye implements most of Bayeux, the only thing missing right now is service channels, which I've yet to be convinced of the usefulness of. In particular Faye is designed to be compatible with the CometD reference implementation of Bayeux, which has a large bearing on the following.
Conceptually, yes: Faye could use Socket.IO. In practice, there are some barriers to this:
I've no idea what kind of server-side support Socket.IO requires, and the requirement that the Faye client (there are server-side clients in Node and Ruby, remember) be able to talk to any Bayeux server (and the Faye server to any Bayeux client) may be a deal-breaker.
Bayeux has specific requirements that servers and clients support certain transport types, and says how to negotiate which one to use. It also specifies how they are used, for example how the Content-Type of an XHR request affects how its content is interpreted.
For some types of error handling I need direct access to the transport, for example resending messages when a client reconnects after a Node WebSocket dies.
Please correct me if I've got any of this wrong - this is based on a cursory scan of the Socket.IO documentation.
Faye is just pub/sub; it's simply based on a slightly more complex protocol and has a lot of niceties built in:
Server- and client-side extensions
Wildcard pattern-matching on channel routes
Automatic reconnection, e.g. when WebSockets die or the server goes offline
The client works in all browsers, on phones, and server-side on Node and Ruby
Faye probably looks a lot more complex compared to Juggernaut because Juggernaut delegates more, e.g. it delegates transport negotiation to Socket.IO and message routing to Redis. These are both fine decisions, but my decision to use Bayeux means I have to do more work myself.
As for design philosophy, Faye's overriding goal is that it should work everywhere the Web is available and should be absolutely trivial to get going with. It's really simple to get started with, but its extensibility means it can be customized in quite powerful ways; for example, you can turn it into a server-to-client push service (i.e. stop arbitrary clients pushing to it) by adding authentication extensions.
There is also work underway to make it more flexible on the server side. I'm looking at adding clustering support, and making the core pub-sub engine pluggable so you could use Faye as a stateless web frontend for another pub-sub system like Redis or AMQP.
I hope this has been helpful.
AFAIK, yes, apart from the fact that Juggernaut only uses Redis for pub/sub, not persistence. That also means client libraries in most languages have already been written, since it just needs a Redis adapter (a minimal Redis publish is sketched below).
Juggernaut doesn't implement Bayeux, but rather has a very simple custom JSON protocol
I don't know, but probably
Juggernaut is very simple, and designed to be that way. Although I haven't used Faye, from the docs it looks like it has a lot more features than just pub/sub. Being built on top of Socket.IO has its advantages too: Juggernaut is supported in practically every browser, both desktop and mobile.
I'll be really interested in what Faye's author has to say. As I say, I haven't used it, and it would be great to know how it compares to Juggernaut. It's probably a case of using the best tool for the job. If it's pub/sub you need, Juggernaut does that very well.
Faye certainly could.
Another example of a similar project on top of Socket.IO:
https://github.com/aaronblohowiak/Push-It
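To make the "just needs a Redis adapter" point above concrete, publishing into a Redis pub/sub channel from Python is a one-liner with the redis package; the channel name and payload shape below are illustrative, not necessarily Juggernaut's exact wire format:

```python
# Publish an event into Redis pub/sub; any process subscribed to the channel
# (e.g. a Juggernaut-style fan-out server) can forward it to connected browsers.
import redis

r = redis.Redis(host="localhost", port=6379)
r.publish("juggernaut", '{"channels": ["/updates"], "data": "hello"}')
```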

How to structure a client-server application with 'push' notifications

EDIT: I forgot to include the prime candidate for web applications: JSON over HTTP/REST + Comet. It combines the best features of the others (below)
Persevere basically bundles everything I need in a server
The focus for Java and such is definitely on Comet servers, but it can't be too hard to use/write a client.
I'm embarking on an application with a server holding data, and clients executing operations which would affect this data, and thus require some sort of notification across all interested/subscribed clients.
The first client will probably be written in WPF, but we'll probably need to add clients written in other languages, e.g. a Java (Swing?) client, and possibly, a web client.
The actual question(s): What protocol should I use to implement this? How easy would it be to integrate with JS, Java and .NET (precisely, C#) clients?
I could use several interfaces/protocols, but it'd be easier overall to use one that is interoperable. Given that interoperability is important, I have researched a few options:
JSON-RPC
  - lightweight
  - supports notifications (see the example after this list)
    - the only .NET lib I could find, Jayrock, doesn't support notifications
  - works well with JS
    - also true of XML-based stuff (and possibly even binary protocols), BUT JSON would probably be more efficient, thanks to native support
Protobuf/Thrift
  - IDL makes it easy to spit out model classes in each language
  - doesn't seem to support notifications
  - Thrift comes with RPC out of the box, but protobufs don't
  - not sure about JS
XML-RPC
  - simple enough, but doesn't support notifications
SOAP
  - I'm not even sure about this one; I haven't grokked it yet
  - seems rather complex
Message Queues / PubSub approach
  - not strictly a protocol, but might be fitting
  - I hardly know anything about them, and got lost amongst the buzzwords -- JMS? *MQ?
  - perhaps combined with some RPC mechanism above, although that might not be strictly necessary, and possibly overkill
Other options are, of course, welcome.
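As a point of reference for the "supports notifications" item above: in JSON-RPC 2.0 a notification is just a request object without an "id" member, so the receiver sends no response. A sketch (method and params are made up):

```python
# A JSON-RPC 2.0 notification: no "id" member, so no response is expected.
# This is the mechanism a peer can use to push events to the other side.
import json

notification = json.dumps({
    "jsonrpc": "2.0",
    "method": "transactionUpdated",          # hypothetical method name
    "params": {"id": 42, "status": "done"},  # hypothetical payload
})
print(notification)
```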
I am partial to the pub/sub design you've suggested. I'd take a look at ZeroMQ. It has bindings to C#, Java, and many other platforms.
Bindings list: http://www.zeromq.org/bindings:clr
I also found this conversation on the ZeroMQ dev listing that may answer some questions you have about multiple clients and ZeroMQ: http://lists.zeromq.org/pipermail/zeromq-dev/2010-February/002146.html
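A minimal pub/sub sketch with pyzmq, just to show the shape of the design; the question's clients would use the C# or Java bindings, but the pattern is identical (endpoint and topic names are made up, and the two functions are meant to run in separate processes):

```python
# Minimal ZeroMQ pub/sub: the server publishes change notifications,
# every interested client subscribes to the topics it cares about.
import zmq

def publisher():
    ctx = zmq.Context.instance()
    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://*:5556")
    # topic frame first, payload frame second
    pub.send_multipart([b"data.updated", b'{"id": 42, "status": "committed"}'])

def subscriber():
    ctx = zmq.Context.instance()
    sub = ctx.socket(zmq.SUB)
    sub.connect("tcp://localhost:5556")
    sub.setsockopt(zmq.SUBSCRIBE, b"data.updated")   # prefix filter
    topic, payload = sub.recv_multipart()
    print(topic, payload)
```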
As XMPP was mentioned: SIP has similar functionality. This might be more accessible for you.
We use Servoy for this. It does automatic data broadcasting to web-clients and java-clients. I'm not sure if broadcasts can be sent to other platforms, you might be able to find an answer to that on their forum.
If you want to easily publish events to clients across networks, you may wish to look at the XMPP standard. (Used by, amongst other things, Jabber and Google Talk.)
See the extension for publish-subscribe functionality.
There are a number of libraries in different languages including C#, Java and Javascript.
You can use SOAP over HTTP to modify the data on the server and SOAP over SMTP to notify the subscribed clients.
OR
The server doesn't know anything about the subscription, and the clients poll the server on a timer to track the updates they are interested in, using XML-RPC, SOAP (generated from WSDL), or simply HTTP GET if there is no need to pass complex data back.
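The polling variant is trivial on any client platform; a rough Python illustration (the URL, query parameter and interval are made up):

```python
# Illustrative polling client: ask the server for changes on a timer.
import time
import urllib.request

def poll_for_updates(last_seen=0):
    while True:
        with urllib.request.urlopen(f"http://server.example/updates?since={last_seen}") as resp:
            body = resp.read()
            # parse body, advance last_seen, notify local listeners...
        time.sleep(5)   # the "timeout" between polls
```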
