Can simple protobuf types be migrated to "optional"?

In protobuf version 3 the required and optional keywords were initially removed, since required often caused problems (see protobuf issue 2497).
The optional keyword was recently reintroduced (protobuf v3.15.0).
Is it possible to simply add the optional keyword to an existing message?
I.e. change
message Test {
int32 int32_value = 1;
string text_value = 2;
}
to
message Test {
optional int32 int32_value = 1;
optional string text_value = 2;
}
Or will this break the binary format?

Non-optional primitive types in protobuf don't accept null values and normally map to non-nullable types such as int in Java or C#.
But this doesn't mean that the field is always included in the binary representation.
In fact, if a field contains the default value for the corresponding type the field is omitted in the binary representation.
Thus the following message
message Test {
int32 int32_value = 1;
string text_value = 2;
}
Test test = new Test();
byte[] buffer = test.ToByteArray();
gets serialized to an empty byte[].
So missing fields fall back to default values without the use of optional.
The optional keyword changes the behaviour both for missing fields in the binary format and for fields explicitly set to their default value:
A missing field means the field has not been specified and is treated as null. Setting a field to its default value no longer results in an empty byte[]; the default value is serialized.
Thus changing a primitive field to optional won't break the format, but it will change the semantics:
All fields of old messages that were set to the default value will be interpreted as null. Other values are not affected.
The same holds when optional is removed from a field:
The format won't break, but the semantics change. Unspecified fields will then fall back to the default value of the corresponding type.
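To make the difference concrete, here is a minimal sketch that hand-encodes the Test message in Python without the protobuf library. The function name encode_test and the explicit_presence flag are my own illustration, not part of any protobuf API; None stands in for "unset".

```python
from typing import Optional


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_test(int32_value: Optional[int], text_value: Optional[str],
                explicit_presence: bool) -> bytes:
    """Hand-encode message Test (field 1: int32, field 2: string).

    explicit_presence=False mimics plain proto3 fields: defaults are omitted.
    explicit_presence=True mimics optional fields: only None (unset) is omitted.
    """
    if explicit_presence:
        write_int = int32_value is not None
        write_text = text_value is not None
    else:
        write_int = bool(int32_value)    # 0 is the default, so it is skipped
        write_text = bool(text_value)    # "" is the default, so it is skipped

    out = bytearray()
    if write_int:
        out += b"\x08"  # key: field 1, wire type 0 (varint)
        # Negative int32 values are sign-extended to 64 bits on the wire.
        out += encode_varint(int32_value & 0xFFFFFFFFFFFFFFFF)
    if write_text:
        data = text_value.encode("utf-8")
        out += b"\x12"  # key: field 2, wire type 2 (length-delimited)
        out += encode_varint(len(data)) + data
    return bytes(out)


print(encode_test(0, "", explicit_presence=False))      # b''
print(encode_test(0, "", explicit_presence=True))       # b'\x08\x00\x12\x00'
print(encode_test(None, None, explicit_presence=True))  # b''
```

Non-default values produce the same bytes in both modes, which is why only fields holding the default value change meaning after the migration.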

Related

grpc and protobuf - How to handle a new field when the other side is not releasing in sync

I've got a situation where the other end of the grpc communication is not in sync with their releases. My higher-ups would therefore like me to add a field that will work whether or not the other side fills it out, for a short time period (like two weeks).
I believe I can do this by adding it to the end of the proto message such that the indices for the other fields do not change. From what I've Googled, the optional keyword is not available prior to version 3.15, so I have to use a workaround.
The workaround that was described to me was to use oneof. However, I am not 100% sure what that looks like. All examples show the oneof field by itself. Are the indices that belong to the oneof values independent of the indices that belong to the rest of the message?
message TestMessage {
string somefield = 1;
int someotherfield = 2;
oneof mynewoptionalfield
{
string mynewfield = ???? Does this have to be 3 or is it 1?
int ifihadanother = ???? Does this need to be 4 or 2?
}
}
Questions:
What are the indices I use where the ??? marks are
Is this the proper work around to use when the other side isn't going to recompile and deploy with the changes to the protofile?
How do I then check if the field was filled in my C++ code?
Your use-case is exactly what protobufs were designed to handle. All you need to do is add a new field to the message. In the easiest case, the client application code simply doesn't look at the new field until the server roll-out is complete, and so doesn't notice that it is sometimes present and other times missing.
You are correct that you should not change the indices (field ids) of the pre-existing fields. Although I'll note that you can add your new field anywhere within the message; the order the fields are written in does not matter for protobuf.
So you'd just add another field like:
message TestMessage {
string somefield = 1;
int32 someotherfield = 2;
string mynewfield = 3;
}
You don't have to use 3 as the id. You could use 4, or 10, or 10000. But small numbers are more efficient for protobuf (ids 1 through 15 need only a single byte for the field key) and it is typical to just choose the "next" id. On the wire, protobuf uses the id to identify the field, so it is important that you don't change the id later.
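As a rough illustration of why an out-of-sync peer keeps working, here is a minimal Python sketch of a wire-format reader that only handles varint and length-delimited fields. The names parse_fields, old_reader and KNOWN_TO_OLD_READER are hypothetical, and real generated code does considerably more than this.

```python
def decode_varint(buf: bytes, pos: int):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_fields(buf: bytes) -> dict:
    """Map field id -> raw value for wire types 0 (varint) and 2 (length-delimited)."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        field_id, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:
            length, pos = decode_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields[field_id] = value
    return fields


# A reader built from the old .proto only knows ids 1 and 2.
KNOWN_TO_OLD_READER = {1, 2}


def old_reader(buf: bytes) -> dict:
    """Keep the fields the old schema knows about and ignore the rest."""
    return {n: v for n, v in parse_fields(buf).items() if n in KNOWN_TO_OLD_READER}


# somefield = "a", someotherfield = 7, mynewfield = "new"
new_payload = b"\x0a\x01a\x10\x07\x1a\x03new"
# Same message sent by a peer that does not know mynewfield yet
old_payload = b"\x0a\x01a\x10\x07"

print(old_reader(new_payload))         # {1: b'a', 2: 7}
print(3 in parse_fields(old_payload))  # False
```

Because each value is prefixed with its id and wire type, a reader can step over ids it does not recognise, and a writer that never sets id 3 simply produces a shorter payload.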
In protobuf 3, all fields are "optional" in the protobuf 2 sense; there are no "required" fields. However, protobuf 2 also provided "field presence" for all fields. Protobuf 3 only provided field presence for oneofs and messages... until the recent re-introduction of the "optional" keyword.
In protobuf 3, if you call testMessage.getMynewfield() it will always return a non-null string. If the string was not sent, it will be the empty string (""). For integers 0 is returned, and for messages the "default message" (all defaults) is returned. This is plenty for many use-cases, and may be enough for you.
But let's say you need to distinguish between "" and <notsent>. That's what field presence provides. Message-typed fields in protobuf 3 have "has" methods that return true if a value is present, but primitives don't carry that presence information. One option is to "box" the primitive with the standard wrappers, which turn the primitive into a message. Another option, available in newer versions of protobuf, is the optional keyword. Both options will provide a method like testMessage.hasMynewfield().
message TestMessage {
string somefield = 1;
int32 someotherfield = 2;
google.protobuf.StringValue mynewfield = 3;
// -or-
optional string mynewfield = 3;
}

Stopping omission of default values in Protocol Buffers

I have a proto schema defined as below,
message User {
int64 id = 1;
bool email_subscribed = 2;
bool sms_subscribed = 3;
}
Now, as per the official proto3 documentation, default values are not serialized, to save space during wire transmission. But in my case I want to know whether the client has explicitly set true/false for the fields email_subscribed/sms_subscribed (because the values were true before but now the user wants to unsubscribe). However, when the client sends false for any of these fields, the generated serializer just omits them.
How do I achieve this and avoid the omission of these fields for the above scenario?
PS: I am using JavaScript for my gRPC client and Python for my gRPC server.
Update: this has changed recently with the re-introduction of presence tracking into proto3 via a new meaning of the optional keyword:
message User {
optional int64 id = 1;
optional bool email_subscribed = 2;
optional bool sms_subscribed = 3;
}
With this change (now available in protoc etc.), an explicit assignment is transmitted even if it is the implicit default value.
Before that change, you could not do this under proto3. Your best bet then was probably to define a tri-bool enum with not-specified as the first item with value zero, and some true / false values after that.
This will require the same space as a protobuf bool, but will not be binary compatible - so you cannot simply change the declared member type on existing messages. Well, I guess if you make true === 1, then at least that still works - and for the transition you'd have to anticipate false / not specified being ambiguous until you've flushed any old data.
The other option is to add a bool fooSpecified member for every bool foo, but that takes more space and is error-prone due to being manual.
Another option is to use the wrapper types with proto3. They wrap your value in a message, so on the parent message the field can be left null.
This way you can differentiate null / false / true on your bool field with some extra work.
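To see why a wrapper gives three states, here is a small Python sketch of the bytes involved. It assumes a hypothetical change of the schema to `google.protobuf.BoolValue email_subscribed = 2;` and hand-encodes that single field; the helper names are mine, and the one-byte key and length only hold for field numbers up to 15 and short payloads.

```python
from typing import Optional


def encode_wrapped_bool(field_number: int, value: Optional[bool]) -> bytes:
    """Encode a BoolValue-style wrapper field; None means 'not set'."""
    if value is None:
        return b""  # the wrapper message is absent altogether
    # Inside the wrapper, `bool value = 1` is a plain proto3 field,
    # so false is omitted and the nested message is empty.
    inner = b"\x08\x01" if value else b""
    key = (field_number << 3) | 2  # wire type 2: length-delimited
    return bytes([key, len(inner)]) + inner


def decode_wrapped_bool(field_number: int, payload: bytes) -> Optional[bool]:
    """Inverse of encode_wrapped_bool for a payload holding only that field."""
    if not payload:
        return None
    if payload[0] != (field_number << 3) | 2:
        raise ValueError("unexpected field key")
    length = payload[1]
    return payload[2:2 + length] == b"\x08\x01"


print(encode_wrapped_bool(2, None))   # b''
print(encode_wrapped_bool(2, False))  # b'\x12\x00'
print(encode_wrapped_bool(2, True))   # b'\x12\x02\x08\x01'
```

An explicit false still puts the key and a zero length on the wire, which is what lets the server tell it apart from a field that was never set.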

How to ask protoc to use a value instead of a pointer as the value side of a map with Go?

I am using protocol buffers defined like this:
message Index {
message albums {
repeated string name = 1;
}
map<string, albums> artists_albums = 1;
map<int32, albums> year_albums = 2;
}
It generates go code like this:
type Index struct {
ArtistsAlbums map[string]*IndexAlbums
YearAlbums map[int32]*IndexAlbums
}
How can I make it generate map values of type IndexAlbums instead of *IndexAlbums?
If you use gogoprotobuf, there is an extension that allows that:
map<string, albums> artists_albums = 1 [(gogoproto.nullable) = false];
With regular goprotobuf I don't believe there is a way.
nullable, if false, a field is generated without a pointer (see warning below).
Warning about nullable: According to the Protocol
Buffer specification, you should be able to tell whether a field is
set or unset. With the option nullable=false this feature is lost,
since your non-nullable fields will always be set. It can be seen as a
layer on top of Protocol Buffers, where before and after marshalling
all non-nullable fields are set and they cannot be unset.

Protocol buffer: does changing field name break the message?

With protocol buffers, does changing the field name of a message keep it backward compatible? I couldn't find any reference about that.
E.g., the original message:
message Person {
required string name = 1;
required int32 id = 2;
optional string email = 3;
}
Change to:
message Person {
required string full_name = 1;
required int32 id = 2;
optional string email = 3;
}
Changing a field name will not affect the protobuf encoding or compatibility between applications that use proto definitions which differ only by field names.
The binary protobuf encoding is based on tag numbers, so that is what you need to preserve.
You can even change a field type to some extent (check the type table at https://developers.google.com/protocol-buffers/docs/encoding#structure) providing its wire type stays the same, but that requires additional considerations whether, for example, changing uint32 to uint64 is safe from the point of view of your application code and, for some definition of 'better', is better than simply defining a new field.
Changing a field name will affect the JSON representation, if you use that feature.
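A quick way to convince yourself is to hand-encode the Person message from two schemas that differ only in the name of field 1. The sketch below is plain Python, not the protobuf library; the schema dictionaries and encode_message are illustrative helpers, and only non-negative int32 values and strings are handled.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_message(schema: dict, values: dict) -> bytes:
    """Encode values using a schema of name -> (tag number, type)."""
    out = bytearray()
    for name, (tag, kind) in schema.items():
        if name not in values:
            continue
        if kind == "int32":
            out += encode_varint((tag << 3) | 0)  # wire type 0: varint
            out += encode_varint(values[name])
        else:  # "string"
            data = values[name].encode("utf-8")
            out += encode_varint((tag << 3) | 2)  # wire type 2: length-delimited
            out += encode_varint(len(data)) + data
    return bytes(out)


OLD_SCHEMA = {"name": (1, "string"), "id": (2, "int32"), "email": (3, "string")}
NEW_SCHEMA = {"full_name": (1, "string"), "id": (2, "int32"), "email": (3, "string")}

old_bytes = encode_message(OLD_SCHEMA, {"name": "Ann", "id": 7})
new_bytes = encode_message(NEW_SCHEMA, {"full_name": "Ann", "id": 7})

print(old_bytes == new_bytes)  # True
```

The field name never reaches the output; only the tag number and wire type do, so both schemas produce identical bytes.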

how to give default values in protocol buffer?

message Person {
required Empid = 1 [default = 100];
required string name = 2 [default = "Raju"];
optional string occupation = 3;
repeated string snippets = 4;
}
Can I give the default values as mentioned above?
For proto3, custom default values are disallowed.
Update: The below answer is for proto2 only, proto3 doesn't allow custom default values.
Yes, you can give default values as you have written. A default is optional on a required field; on an optional field, if you do not specify a default, a type-specific value is assigned automatically. Also, you forgot to specify the type for Empid.
The protobuf language guide states:
If the default value is not specified for an optional element, a
type-specific default value is used instead: for strings, the default
value is the empty string. For bools, the default value is false. For
numeric types, the default value is zero. For enums, the default value
is the first value listed in the enum's type definition. This means
care must be taken when adding a value to the beginning of an enum
value list.

Resources