What's the difference between Protocol Buffers and FlatBuffers?

Both are serialization libraries developed at Google. Is there any big difference between them? Is it a lot of work to convert code that uses Protocol Buffers to use FlatBuffers instead?

I wrote a detailed comparison of a few serialization systems, including Protobufs and FlatBuffers, here:
https://kentonv.github.io/capnproto/news/2014-06-17-capnproto-flatbuffers-sbe.html
However, the comparison focuses more on comparing the three new "zero-copy" serialization systems, and includes Protobufs mostly as a reference point. Also, I'm the author of Cap'n Proto as well as Protobufs v2 (I was responsible for open sourcing Protobufs at Google), so the comparison may be biased.
(Updated in 2021:) Note that Protobufs was introduced at Google way back in 2001 or so and remains the "lingua franca" there today. FlatBuffers was introduced in 2014 and is used in some projects, but Protobuf remains Google's main data interchange format, and there is no intention to change that. To be fair, Google probably couldn't change this if they wanted to; there's just too much code.
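To make the "zero-copy" distinction concrete, here is a minimal Java sketch of the difference in access patterns. It assumes hypothetical generated classes: PersonProto.Person from protoc and PersonFlat from flatc, built from equivalent schemas; neither name comes from the projects' own examples.

    // A minimal sketch, assuming hypothetical generated classes:
    //   PersonProto.Person - generated by protoc from a .proto schema
    //   PersonFlat         - generated by flatc from an equivalent .fbs schema
    import java.nio.ByteBuffer;

    public class ZeroCopyContrast {

        static void readWithProtobuf(byte[] wire) throws Exception {
            // Protobuf: parseFrom() decodes the whole message into new Java
            // objects (a copy of the data) before any field can be read.
            PersonProto.Person p = PersonProto.Person.parseFrom(wire);
            System.out.println(p.getName());
        }

        static void readWithFlatBuffers(ByteBuffer wire) {
            // FlatBuffers: the accessor object is just a view over the buffer;
            // getRootAs...() does no decoding, and name() reads straight from it.
            PersonFlat p = PersonFlat.getRootAsPersonFlat(wire);
            System.out.println(p.name());
        }
    }

The practical upshot is that protobuf pays its decoding cost up front at parse time, while FlatBuffers defers it to each field access against the original buffer.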

Related

protocol buffers in web application architecture -- when are they not worth the trouble?

I am new to web development, and I've seen many sites preaching the benefits of using protocol buffers -- for example: https://codeclimate.com/blog/choose-protocol-buffers/.
I'm not sure if some of the benefits apply to my use case:
Having a unified schema out of the .proto file: If I validate my data in the front and back-end, which I should do anyway, a unified schema is enforced explicitly. I don't see any added benefit in this regard from using protocol buffers.
Auto-generating the setters and getters from the .proto file: This looks like a nice selling point. But I wouldn't need any setters and getters if I didn't use protocol buffers in the first place. I found them really cumbersome to work with:
They remove capitalization, which alters the original variable names
They are unnatural to work with. For example, in C++ I would want to work with just a plain old data structure, but instead I have to do something like ptr_message->shouldBeStruct1().shouldBeStructArray(20).shouldBeInt();
Easy Language Interoperability: I really doubt it is good practice to design my data-consuming code so that it works on a protobuf message rather than a struct. So I would need to map the protobuf into a plain data struct first (see the mapping sketch below).
The only potential benefit I see is the reduced data size when transmitting on the wire. But, does this really justify the overhead of additional middleware to work with protocol buffers? What am I missing?
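To illustrate that last point, this is roughly the glue layer I expect I'd have to write. It's only a sketch: Person is my own plain class, and PersonProto.Person stands in for whatever class protoc would generate from my .proto.

    // Sketch of the mapping layer; "Person" is my own plain class and
    // "PersonProto.Person" stands in for the protoc-generated message class.
    public class Person {
        public String name;
        public int age;

        // Copy out of the generated protobuf message into the plain struct.
        static Person fromProto(PersonProto.Person msg) {
            Person p = new Person();
            p.name = msg.getName();
            p.age = msg.getAge();
            return p;
        }

        // ...and back again when sending.
        static PersonProto.Person toProto(Person p) {
            return PersonProto.Person.newBuilder()
                    .setName(p.name)
                    .setAge(p.age)
                    .build();
        }
    }

That mapping layer is exactly the extra middleware I'm questioning.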

Better compatible version of HL7, v2 or v3?

I am going to implement a generic HMIS with a true implementation of HL7. I have studied the advantages and disadvantages of both versions of HL7, i.e. v2 and v3, but I am still unsure which version is better to implement: v2 for its stability, or v3 for its plug-and-play compatibility. I need your opinion.
HL7 is the organization, but it is also a set of interoperability standards. That means it is not a function in your system that operates on its own; it is the way your system communicates with other systems. So the interface that you need to implement in your system - HL7v2, HL7v3, or HL7 FHIR - is actually dictated by your counterparts.
For example, if you are in the US, most likely you'll end up with HL7v2 for messaging, HL7v3 CDA for documents (better known as the separate C-CDA standard), and HL7 FHIR for SMART initiatives. (Let's assume we are not talking about IHE profiles with the "v3" suffix.) For Canada and the UK it will be much the same, with the only difference that these countries use both HL7v2 and HL7v3 for messaging.
I would like to answer your question from the perspective of implementation and data consumption.
HL7v2 is pipe-delimited, v3 is XML-based, and FHIR comes in JSON and XML flavors (see the small parsing sketch at the end of this answer). Before discussing advantages and disadvantages, it is essential to understand how the end system consumes data and what provisions it has; based on that you can proceed further.
If the question is about how efficiently all patient data can be captured in a message format, I would go with both v2 and v3. V3 is much more standardized and gives more specifications and descriptions. V2 also has HL7-specific standards; if you think your particular message format (ADT/ORU/DFT) lacks fields to capture something, you can use Z-segments or NTE segments. The V3 CDA standards (in my experience so far) cover most information within the specification itself.
For example, consider the CDA standards: depending on the need, CDA comes in its own flavors. Per the HL7 standards there are separate C-CDA document types for Progress Notes, Procedure Notes, Transition of Care, Diagnostic Imaging Reports, and so on.
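To make the wire-format difference concrete: an HL7v2 message is just segments separated by carriage returns, with pipe-delimited fields and caret-delimited components, so a minimal reader is a few lines of string splitting, whereas a v3/CDA document is a full XML document you would hand to an XML parser. A rough Java sketch, using a made-up PID segment (a real implementation would also honour the separators declared in MSH and the escape rules):

    public class Hl7V2Sketch {
        public static void main(String[] args) {
            // Made-up PID segment; in HL7v2, PID-5 carries the patient name.
            String segment = "PID|1||12345^^^HOSP^MR||Doe^John||19700101|M";

            // Fields are separated by '|', components within a field by '^'.
            String[] fields = segment.split("\\|");
            String[] name = fields[5].split("\\^");

            System.out.println("Patient: " + name[1] + " " + name[0]);
            // A v3/CDA payload, by contrast, is a full XML document you would
            // hand to an ordinary XML parser; FHIR resources are JSON or XML.
        }
    }

In practice you would use an HL7 library rather than hand-rolling this, but it shows why v2 is often described as lightweight compared to v3.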

Is it possible to parse non-protobuf messages using protobuf?

I am working on a project where we are using protocol buffers to create and parse some of our messages (protobuf-net). This is so elegant that I would like to use the same deserialization method to parse other messages coming from external, non-protobuf sources. Is this possible?
I would imagine that it could be possible to specify all of the .proto fields to be fixed size (i.e. not variable-length ints). The question is then whether you could replace the protobuf headers with other magic numbers or whichever header the third-party protocol uses.
If this is a bit confusing, an example may shed some light:
Let's say you buy a fancy toaster that exposes an Ethernet port. It supports a proprietary but well-documented protocol. Can you burn heart-shaped patterns on your toast using protobuf?
At the moment, no: the library is tied to the protobuf wire specification; it does not have support for non-protobuf data.
In a way, it is a bit like asking: "can XmlSerializer read/write json?". It isn't something that is on my list of things to look at, to be honest.
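Roughly why: every field protobuf puts on the wire is prefixed with a key varint built from the field number and a wire type, and the parser relies on that framing to walk the stream; an arbitrary fixed-layout protocol (like the hypothetical toaster's) has no such keys. A tiny hand-rolled Java sketch of that framing, for illustration only (no library calls):

    public class WireFramingSketch {
        // Protobuf wire format: each field starts with a varint key equal to
        // (fieldNumber << 3) | wireType; wire type 0 means a varint payload.
        static int key(int fieldNumber, int wireType) {
            return (fieldNumber << 3) | wireType;
        }

        public static void main(String[] args) {
            // Field 1 with varint value 150 encodes as the bytes 0x08 0x96 0x01.
            System.out.printf("key byte: 0x%02X%n", key(1, 0));
            // A proprietary fixed-layout protocol has no such keys, so a protobuf
            // parser cannot know where one field ends and the next begins.
        }
    }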

gson vs protocol buffer

What are the pros and cons of protocol buffer (protobuf) over GSON?
In what situations is protobuf more appropriate than GSON?
I am sorry for a very generic question.
Both json (via the gson library) and protobuf are portable between platforms; but
protobuf is smaller (bandwidth) and cheaper (CPU) to read/write
json is human readable / editable (protobuf is binary; hard to parse without library support)
protobuf fragments are trivial to merge - just concatenate them
json is easily passed to web page clients
the main java version of protobuf needs contract-definition (.proto) and code-generation; gson seems to allow arbitrary pojo usage (there are protobuf implementations that work on such objects, but not for java afaik)
If performance is key: protobuf
For use with a web page (JavaScript), or for human readability: json (perhaps via gson)
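As a rough illustration of the workflow difference (a sketch only: the Gson side works on any plain class, while PersonProto.Person is a hypothetical class you would get by writing a .proto and running protoc):

    import com.google.gson.Gson;

    public class GsonVsProtobufSketch {
        // Any plain class works with Gson; no schema or code generation needed.
        static class Person {
            String name = "Ada";
            int age = 36;
        }

        public static void main(String[] args) throws Exception {
            // Gson: human-readable text, produced by reflection over a POJO.
            Gson gson = new Gson();
            String json = gson.toJson(new Person());   // {"name":"Ada","age":36}
            Person fromJson = gson.fromJson(json, Person.class);

            // Protobuf: compact binary, but it needs a .proto contract and the
            // generated class (PersonProto.Person here is hypothetical).
            byte[] wire = PersonProto.Person.newBuilder()
                    .setName("Ada").setAge(36).build().toByteArray();
            PersonProto.Person fromWire = PersonProto.Person.parseFrom(wire);
        }
    }

The JSON string is something you can read and edit by hand; the protobuf bytes are smaller but opaque without the schema and library.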
If you want efficiency and cross-platform compatibility, you should send raw messages between applications containing the information that is necessary and nothing more, nothing less.
Serialising classes via Java's own mechanisms, gson, protobufs or whatever creates data that contains not only the information you wish to send, but also information about the logical structures/hierarchies of the data structures that have been used to represent the data inside your application.
This makes those classes and data mappings dual-purpose: one, to represent the data internally in the application, and two, to be transmitted to another application. Those two roles can conflict, and there is an onus on the developer to remember that the classes, collections, and data layout he is working with at any time will also be serialised.

Is there a fast and reliable way of serializing objects across different versions of Ruby?

I have two applications talking to each other using a queue. As of now they run exactly the same version of Ruby (1.8.7), so I'm just marshaling objects back and forth; only objects from the standard lib, mostly hashes, strings, and time and date objects.
Right now I'm moving to Ruby 1.9.1, one app at a time, which means I'll be running one app with 1.8.7 and the other with 1.9.1 for a while. From running my tests I know Marshal will not be reliable across versions. I could use YAML, but it is much slower; JSON seems to be faster, but it does not deal directly with date/time objects.
Is there a reliable and fast way to serialize ruby objects across different versions?
I haven't tried it on ruby, but you could look at protocol buffers? Designed as a fast but portable binary format, it has a ruby port here. You would probably have to treat the generated types as a separate DTO layer, though (i.e. you map your existing data into the new types, rather than serialize your existing objects). Note that there is no inbuilt date-time support, but you could just use ticks in an epoch etc.
The key here is finding a common data type that you know will be represented the same across Ruby versions. The obvious choices here are storing data in an external database (the DB interface libraries will handle all the conversions) or writing the data out in a structured text format. If there's not a ton of data to work with (and the data is mostly standard types), I usually just store it as text; it takes longer to export/import but it's usually faster to write.
Protobufs are good, but require you to pre-define your data structures, if I recall. Thrift is similar to protobufs, but has some decent code generation features.
Apple's binary property list format sounds close to what you need. It's similar to JSON in behavior, but is more compact and supports a few extra types, including datetime and unencoded binary. There are a couple of Ruby implementations on GitHub.
Your best bet may be BERT. BERT is based on Erlang's binary term serialization format. It's compact, includes datetime serialization, and is implemented in a dozen or so languages, including Ruby.
