protobuf oneof backwards compatibility - protocol-buffers

If I had some protobufs created with the following protobuf schema
message Foo {
  Bar1 bar_1 = 1;
  Bar2 bar_2 = 2;
}
but later on updated the protobuf schema to
message Foo {
  oneof foo {
    Bar1 bar_1 = 1;
    Bar2 bar_2 = 2;
  }
}
Will this second version be able to read the protos created with the first version?

I don't think it can.
A Foo message created with the first version of the schema can contain both a bar_1 and a bar_2.
Code generated from the second schema expects there to be only a bar_1 or only a bar_2, so regardless of whatever markers GPB puts in its wire format to denote a oneof, this code wouldn't know what to do with the surplus bar_2.
It's possible that, with the right schema syntax version (is it proto3 that makes everything optional?), a message created using code for the first schema that contains only a bar_1 or only a bar_2 (made possible by the fields being optional) may be parsable by code generated from the second schema. But that would come down to how oneof is treated in the GPB wire format.
All in all, it's best not to make assumptions about wire-format compatibility between conflicting schemas. It'd be easy to write a small utility that reads Foo messages created by the first schema, checks whether both fields are present, and creates fresh Foo messages under the second schema when they are not (you may have to compile the schemas with appropriately distinct namespaces configured); a sketch of such a utility follows. That way you can catch the exceptions (both fields present) and be sure to end up with wire-format-compatible data.
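A minimal sketch of such a utility, assuming the two schema versions are compiled into hypothetical Java packages oldschema and newschema (e.g. via option java_package, with java_multiple_files = true); the class and package names are illustrative, not from an existing codebase:

import com.google.protobuf.InvalidProtocolBufferException;

public final class FooMigration {
    // Re-reads bytes written under the first schema as a Foo of the second schema,
    // rejecting messages that set both fields (which have no faithful oneof representation).
    public static newschema.Foo migrate(byte[] oldBytes) throws InvalidProtocolBufferException {
        oldschema.Foo old = oldschema.Foo.parseFrom(oldBytes);
        if (old.hasBar1() && old.hasBar2()) {
            throw new IllegalStateException("Foo sets both bar_1 and bar_2; cannot map onto the oneof");
        }
        // Field numbers and types are unchanged, so the remaining messages are
        // wire-compatible with the second schema and can simply be re-parsed under it.
        return newschema.Foo.parseFrom(oldBytes);
    }
}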

Yes, the second version of the protobuf schema should be able to read protobufs created with the first version. Changes you make to a schema only affect how new protobufs are encoded and decoded; data written with the previous version of the schema keeps its old encoding, and the updated schema should still be able to parse it even though the schema has changed.
However, it is worth noting that when you change a protobuf schema, you should take care that the changes are backward-compatible, meaning that the new schema can still read protobufs created with the old schema without losing any information. In the example you provided, the change from the first version of the schema to the second is backward-compatible, so the second version should be able to read protobufs created with the first. A change that was not backward-compatible would break that.
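For illustration, a hedged sketch of what code generated from the second (oneof) schema sees when it parses bytes written by the first version; the Foo class and the helper are assumptions based on the schemas in the question:

import com.google.protobuf.InvalidProtocolBufferException;

public class OneofReadDemo {
    // oldBytes: a Foo serialized by code generated from the first schema.
    static void describe(byte[] oldBytes) throws InvalidProtocolBufferException {
        Foo parsed = Foo.parseFrom(oldBytes);   // Foo generated from the second schema
        switch (parsed.getFooCase()) {          // generated accessor for the oneof
            case BAR_1:
                System.out.println("bar_1 is set: " + parsed.getBar1());
                break;
            case BAR_2:
                System.out.println("bar_2 is set: " + parsed.getBar2());
                break;
            case FOO_NOT_SET:
                System.out.println("neither field was present");
                break;
        }
        // Note: per the protobuf language guide, if both members of a oneof appear on the
        // wire (as an old message with bar_1 and bar_2 could), only the last one seen is kept.
    }
}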

Related

What is the purpose of RocksDBStore with Serdes.Bytes() and Serdes.ByteArray()?

RocksDBStore<K,V> stores keys and values as byte[] on disk. It converts to/from K- and V-typed objects using the Serdes provided when constructing the RocksDBStore<K,V> object.
Given this, please help me understand the purpose of the following code in RocksDbKeyValueBytesStoreSupplier:
return new RocksDBStore<>(name,
Serdes.Bytes(),
Serdes.ByteArray());
Providing Serdes.Bytes() and Serdes.ByteArray() looks redundant.
RocksDbKeyValueBytesStoreSupplier was introduced in KAFKA-5650 (Kafka Streams 1.0.0) as part of KIP-182: Reduce Streams DSL overloads and allow easier use of custom storage engines.
In KIP-182, there is the following sentence:
The new Interface BytesStoreSupplier supersedes the existing StateStoreSupplier (which will remain untouched). This so we can provide a convenient way for users creating custom state stores to wrap them with caching/logging etc if they chose. In order to do this we need to force the inner most store, i.e, the custom store, to be a store of type <Bytes, byte[]>.
Please help me understand why we need to force custom stores to be of type <Bytes, byte[]>?
Another place (KAFKA-5749) where I found a similar sentence:
In order to support bytes store we need to create a MeteredSessionStore and ChangeloggingSessionStore. We then need to refactor the current SessionStore implementations to use this. All inner stores should by of type < Bytes, byte[] >
Why?
Your observation is correct -- the PR implementing KIP-182 missed removing the Serdes from RocksDBStore, which are no longer required. This was already fixed in the 1.1 release.
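To make the quoted KIP-182 requirement more concrete, here is a rough, hypothetical sketch (not Kafka's actual classes) of why the innermost store is forced to <Bytes, byte[]>: generic wrappers such as change-logging or caching can then be stacked around any custom store without knowing the user's key/value types, and the typed Serdes are applied only once, in the outermost layer.

import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.utils.Bytes;

// Inner contract: raw bytes only, so it is the same for every key/value type.
interface BytesStoreSketch {
    void put(Bytes key, byte[] value);
    byte[] get(Bytes key);
}

// A generic decorator (logging, caching, metrics, ...) never needs to know K or V.
class ChangeLoggingSketch implements BytesStoreSketch {
    private final BytesStoreSketch inner;
    ChangeLoggingSketch(BytesStoreSketch inner) { this.inner = inner; }
    public void put(Bytes key, byte[] value) {
        // ... write <key, value> to the changelog topic as-is, then delegate ...
        inner.put(key, value);
    }
    public byte[] get(Bytes key) { return inner.get(key); }
}

// Only the outermost, typed layer applies the user's Serdes.
class TypedStoreSketch<K, V> {
    private final BytesStoreSketch inner;
    private final Serde<K> keySerde;
    private final Serde<V> valueSerde;
    TypedStoreSketch(BytesStoreSketch inner, Serde<K> keySerde, Serde<V> valueSerde) {
        this.inner = inner; this.keySerde = keySerde; this.valueSerde = valueSerde;
    }
    public void put(K key, V value) {
        inner.put(Bytes.wrap(keySerde.serializer().serialize(null, key)),
                  valueSerde.serializer().serialize(null, value));
    }
    public V get(K key) {
        byte[] raw = inner.get(Bytes.wrap(keySerde.serializer().serialize(null, key)));
        return raw == null ? null : valueSerde.deserializer().deserialize(null, raw);
    }
}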

Data abstraction in API Blueprint + Aglio?

Reading the API Blueprint specification, it seems set up to allow one to specify 'Data Structures' like:
Address
  street: 100 Main Str. (string) - street address
  zip: 77777-7777 (string) - zip / postal code
  ...
Customer:
  handle: mrchirpy (string)
  address: (address)
And then in the model, make a reference to the data structure:
Model
[Customer][]
It seems all set up that by referencing the data structure it should generate documentation and examples in-line with the end points.
However, I can't seem to get it to work, nor can I find examples using "fully normalized data abstraction". I want to define my data structures once and then reference them everywhere. It seems like it might be a problem with the tooling; specifically, I'm using aglio as the rendering agent.
It seems like all this would be top-of-the-fold stuff, so I'm confused and wondering whether I'm missing something or making the wrong assumptions about what's possible here.
@zanerock, I'm the author of Aglio. The data structure support that you mention is part of MSON, which was recently added to API Blueprint as a way to describe data structures / schemas. Aglio has not yet been updated to support this, but I do plan on adding the feature.

Validate against schema fragment

I'm new to json-schema so it might not be a relevant issue.
I'm using https://github.com/hoxworth/json-schema.
I have one big JSON file describing a lot of schemas (mostly small ones) with many $refs between schemas, and I need to be able to validate data against one of these "inner" schemas. I can't find a way to do this with json-schema.
Does json-schema support this use case, or am I doing it wrong?
It appears it does. It states that it uses JSON Schema v4. See also the source code, at line 265 of lib/json-schema/validator.rb:
def build_schemas(parent_schema)
  # Build ref schemas if they exist
  if parent_schema.schema["$ref"]
    load_ref_schema(parent_schema, parent_schema.schema["$ref"])
  end

distinguish use cases in NSAutosaveElsewhereOperation

I'm trying to add autosave support to the Core Data file wrapper example.
Now, if I have a new/untitled document, writeSafelyToURL is called with the NSAutosaveElsewhereOperation type.
The bad thing is, I get this type in both typical use cases:
- new file: which stores a completely new document by creating the file wrapper and the persistent store file
- save diff: where the file wrapper already exists and only an update is required.
Has somebody else already handled this topic, or has somebody already migrated this?
The original sample uses the originalStoreURL to distinguish those two use cases; which solution worked best for you?
Thanks

How does protocol buffer handle versioning?

How do protocol buffers handle type versioning?
For example, when I need to change a type definition over time, like adding and removing fields?
Google designed protobuf to be pretty forgiving with versioning:
- unexpected data is either stored as "extensions" (making it round-trip safe) or silently dropped, depending on the implementation
- new fields are generally added as "optional", meaning that old data can be loaded successfully
however:
- do not renumber fields - that would break existing data
- you should not normally change the way any given field is stored (i.e. from a fixed-width 32-bit int to a "varint")
Generally speaking, though - it will just work, and you don't need to worry much about versioning.
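A minimal sketch of what that looks like with the standard Java runtime, assuming a hypothetical Person message compiled from two schema revisions into packages v1 and v2, where v2 adds string email = 2:

import com.google.protobuf.InvalidProtocolBufferException;

public class VersioningDemo {
    public static void main(String[] args) throws InvalidProtocolBufferException {
        // Data written by old code: only field 1 (name) is present.
        byte[] oldBytes = v1.Person.newBuilder().setName("Ada").build().toByteArray();

        // New code reads old data: the missing field simply takes its default value.
        v2.Person upgraded = v2.Person.parseFrom(oldBytes);
        System.out.println(upgraded.getEmail().isEmpty());   // true - no email in the old data

        // Old code reads new data: the unknown field 2 is either preserved or silently
        // dropped (depending on runtime and version), but the parse still succeeds.
        byte[] newBytes = v2.Person.newBuilder()
                                   .setName("Ada")
                                   .setEmail("ada@example.com")
                                   .build().toByteArray();
        v1.Person downgraded = v1.Person.parseFrom(newBytes);
        System.out.println(downgraded.getName());            // "Ada"
    }
}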
I know this is an old question, but I ran into this recently. The way I got around it was by using facades and run-time serialization decisions. This way I can deprecate/upgrade a field into a new type, with old and new messages handling it gracefully.
I am using Marc Gravell's protobuf.net (v2.3.5), and C#, but the theory of facades would work for any language and Google's original protobuf implementation.
My old class had a Timestamp of DateTime which I wanted to change to include the "Kind" (a .NET anachronism). Adding this effectively meant it serialized to 9 bytes instead of 8, which would be a breaking serialization change!
[ProtoMember(3, Name = "Timestamp")]
public DateTime Timestamp { get; set; }
A fundamental of protobuf is NEVER to change the proto ids! I wanted to read old serialized binaries, which meant "3" was here to stay.
So,
I renamed the old property and made it private (yes, it can still deserialize through reflection magic), but my API no longer shows it as usable!
[ProtoMember(3, Name = "Timestamp-v1")]
private DateTime __Timestamp_v1 = DateTime.MinValue;
I created a new Timestamp property, with a new proto id, and included the DateTime.Kind
[ProtoMember(30002, Name = "Timestamp", DataFormat = ProtoBuf.DataFormat.WellKnown)]
public DateTime Timestamp { get; set; }
I added an "AfterDeserialization" method to update our new time in the case of old messages
[ProtoAfterDeserialization]
private void AfterDeserialization()
{
    // V2 Timestamp includes a "kind" - we will stop using __Timestamp_v1 - so keep it up to date
    if (__Timestamp_v1 != DateTime.MinValue)
    {
        // Assume the timestamp was in UTC - as it was...
        Timestamp = new DateTime(__Timestamp_v1.Ticks, DateTimeKind.Utc); // This is for old messages - we'll update our V2 timestamp...
    }
}
Now, I have the old and new messages serializing/deserializing correctly, and my Timestamp now includes DateTime.Kind! Nothing broken.
However, this does mean that BOTH fields will be in all new messages going forward. So the final touch is to use a run-time serialization decision to exclude the old Timestamp (note this won't work if it was using protobuf's required attribute!!!)
bool ShouldSerialize__Timestamp_v1()
{
    // protobuf-net honours the ShouldSerialize* pattern: skip the legacy field once it carries no data
    return __Timestamp_v1 != DateTime.MinValue;
}
And that's it. I have a nice unit test which does it end-to-end if anyone wants it...
I know my method relies on .NET magic, but I reckon the concept could be translated to other languages....
