Stopping omission of default values in Protocol Buffers

I have a proto schema defined as below,
message User {
int64 id = 1;
bool email_subscribed = 2;
bool sms_subscribed = 3;
}
According to the official proto3 documentation, default values are not serialized, to save space on the wire. In my case, however, I need to know whether the client has explicitly set email_subscribed/sms_subscribed to true or false (because the values were true before but now the user wants to unsubscribe). When the client sends false for either of these fields, the generated serializer simply omits the field.
How do I avoid the omission of these fields in this scenario?
PS: I am using JavaScript for my gRPC client and Python for my gRPC server.

Update: this has changed recently with the re-introduction of presence tracking into proto3 via a new meaning of the optional keyword:
message User {
optional int64 id = 1;
optional bool email_subscribed = 2;
optional bool sms_subscribed = 3;
}
With this change (now available in protoc etc), explicit assignment is transmitted even if it is the implicit default value.
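To make the difference concrete, here is a minimal sketch of what ends up on the wire in each case. It is a hand-written model of the encoding, not the real protobuf library; encode_bool_field and its explicit_presence flag are hypothetical names used only for illustration.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_bool_field(field_number: int, value, explicit_presence: bool) -> bytes:
    """Encode one bool field; value may be None to mean "never assigned".

    Hypothetical helper: explicit_presence=True models a field declared
    with `optional`, False models a plain proto3 field.
    """
    if value is None:
        return b""  # nothing assigned, nothing written
    if not explicit_presence and value is False:
        return b""  # plain proto3 field: the default value is skipped
    tag = encode_varint((field_number << 3) | 0)  # wire type 0 = varint
    return tag + encode_varint(1 if value else 0)


# email_subscribed = 2, explicitly assigned False
print(encode_bool_field(2, False, explicit_presence=False).hex())  # (empty: nothing on the wire)
print(encode_bool_field(2, False, explicit_presence=True).hex())   # 1000
```

With the optional declaration, the explicit false costs two bytes (tag 0x10, value 0x00), and the receiver can tell it apart from a field that was never assigned.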
Before that change, you could not do this under proto3. Your best bet then is probably to define a tri-bool enum with not-specified as the first item with value zero, and some true / false values after that.
This will require the same space as a protobuf bool, but will not be binary compatible - so you cannot simply change the declared member type on existing messages. Well, I guess if you make true === 1, then at least that still works - and for the transition you'd have to anticipate false / not specified being ambiguous until you've flushed any old data.
The other option is to add a bool fooSpecified member for every bool foo, but that takes more space and is error-prone due to being manual.
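As a rough sketch of how the server side might consume such a tri-state enum: the names TriBool and apply_update below are made up for illustration, and the numeric values (true as 1, false as 2) are just one possible assignment.

```python
from enum import IntEnum


class TriBool(IntEnum):
    """Hypothetical tri-state enum; 0 must be the 'not specified' value."""
    UNSPECIFIED = 0
    TRUE = 1
    FALSE = 2


def apply_update(current: bool, requested: int) -> bool:
    """Return the new subscription state given the value received on the wire."""
    requested = TriBool(requested)  # accept raw ints as well as enum members
    if requested is TriBool.UNSPECIFIED:
        return current  # client did not touch this field
    return requested is TriBool.TRUE


print(apply_update(True, TriBool.UNSPECIFIED))  # True  (unchanged)
print(apply_update(True, TriBool.FALSE))        # False (explicit unsubscribe)
```

Because UNSPECIFIED is the zero value, a client that never sets the field leaves the stored value alone, while an explicit FALSE is non-zero and therefore always transmitted.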

Another option is to use the wrapper types with proto3 (for example google.protobuf.BoolValue). They wrap your value in a message, so on the parent message the field can be left unset (null).
This way you can differentiate null / false / true on your bool field with some extra work.

Related

How to handle a change in the interpretation of a field in a protobuf message?

If a field stores a specific value and is interpreted in a specific manner, is it possible to change this interpretation in a backwards compatible way?
Let's say I have a field that stores values of different data types.
The most generic case is to store it as a byte array and let the apps encode and decode it to the correct data type.
Common cases for data types are integers and strings, so support for those types is present.
Using a oneof structure this looks as follows:
message Foo
{
...
oneof value
{
uint32 integer = 1;
string text = 2;
bytes data = 3;
}
}
Applications that want to store an ip prefix in the value field, have to use the generic data field and do the encoding and decoding correctly.
If I now want to add support for ip prefixes to the Foo message itself so the apps don't have to deal with the encoding and decoding anymore, I could add a new field to the oneof structure with an IpPrefix datatype:
message Foo
{
...
oneof value
{
uint32 integer = 1;
string text = 2;
bytes data = 3;
IpPrefix ip_prefix = 4;
}
}
Even though this makes life easier for the apps, I believe it breaks backwards compatibility.
If a sending app has support for the new field, it will put its ip prefix value in the ip_prefix field.
But if a receiving app does not have support for this new field yet, it will ignore the field.
It will look for the ip prefix value in the data field, as it always did, but it won't find it there.
So the receiving app can no longer read the ip prefix value correctly.
Is there a way to make this scenario somehow backwards compatible?
PS: I realize this is a somewhat vague and perhaps unrealistic example. The exact case I need it for is the representation of RADIUS attributes in a protobuf message. These attributes are in essence a byte array that is sent over the network, but the bytes in the array have meaning and could be stored as different fields in the protobuf message. A basic attribute consists of a Type field and a Value field, where the value can be a string, an integer, an ip address... From time to time new datatypes (even complex ones) are added, and I would like to be able to add them in a backwards compatible way.

There are two ways to go about this:
1. Enforce an update schedule, readers before writers
Add the new type of field to the .proto definition, but document that it should not be used except for testing and reception. Document that all readers of the message must support both the old and the new field by a specific milestone/date, after which the writers can start using it. Eventually you can deprecate the old field and new readers don't need to support it anymore.
2. Have both fields during the transition period
message Foo
{
...
oneof value
{
uint32 integer = 1;
string text = 2;
bytes data = 3;
}
IpPrefix ip_prefix = 4;
}
Document that writers should set both data and ip_prefix during the transition period. The readers can start using ip_prefix as soon as writers have added support, and can optionally fall back to data.
Later, you can deprecate data and move ip_prefix to inside the oneof without breaking compatibility.
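A reader following option 2 could look roughly like the sketch below. Foo is modelled as a plain dataclass rather than generated code, ip_prefix is represented as a string for brevity, and the legacy layout of data (4 address bytes followed by 1 prefix-length byte) is purely an assumption made for the example.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Foo:
    """Stand-in for the generated message; None means 'field not present'."""
    data: Optional[bytes] = None
    ip_prefix: Optional[str] = None  # stands in for the IpPrefix message


def read_ip_prefix(msg: Foo) -> Optional[ipaddress.IPv4Network]:
    """Prefer the new field, fall back to the legacy bytes field."""
    if msg.ip_prefix is not None:
        return ipaddress.IPv4Network(msg.ip_prefix)
    if msg.data is not None:
        # Assumed legacy encoding: 4 address bytes + 1 prefix-length byte.
        address = ipaddress.IPv4Address(msg.data[:4])
        return ipaddress.IPv4Network(f"{address}/{msg.data[4]}")
    return None


# A writer in the transition period sets both fields; an old writer only sets data.
new_writer = Foo(data=bytes([10, 0, 0, 0, 8]), ip_prefix="10.0.0.0/8")
old_writer = Foo(data=bytes([192, 168, 0, 0, 16]))
print(read_ip_prefix(new_writer))  # 10.0.0.0/8
print(read_ip_prefix(old_writer))  # 192.168.0.0/16
```

Old readers keep working because data is still populated, and new readers work against both old and new writers.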

grpc and protobuf - How to handle a new field when the other side is not releasing in sync

I've got a situation where the other end of the gRPC communication is not in sync with our releases. My higher-ups would therefore like me to add a field that works whether or not the other side fills it out, for a short time period (like two weeks).
I believe I can do this by adding it to the end of the proto message so that the indices of the other fields do not change. From what I've Googled, the optional keyword is not available prior to version 3.15, so I have to use a workaround.
The workaround that was described to me was to use oneof. However, I am not 100% sure what that looks like. All examples show the oneof field by itself. Are the indices that belong to the oneof values independent of the indices that belong to the rest of the message?
message TestMessage {
string somefield = 1;
int someotherfield = 2;
oneof mynewoptionalfield
{
string mynewfield = ???? Does this have to be 3 or is it 1?
int ifihadanother = ???? Does this need to be 4 or 2?
}
}
Questions:
What are the indices I use where the ???? marks are?
Is this the proper workaround to use when the other side isn't going to recompile and deploy with the changes to the proto file?
How do I then check if the field was filled in my C++ code?

Your use-case is exactly what protobufs were designed to handle. All you need to do is add a new field to the message. In the easiest case, the client application code simply doesn't look at the new field until the server roll-out is complete, and so doesn't notice that it is sometimes present and sometimes missing.
You are correct that you should not change the indices (field ids) of the pre-existing fields. Although I'll note that you can add your new field anywhere within the message; the order the fields are written in does not matter for protobuf.
So you'd just add another field like:
message TestMessage {
string somefield = 1;
int32 someotherfield = 2;
string mynewfield = 3;
}
You don't have to use 3 as the id. You could use 4, or 10, or 10000. But small numbers are more efficient for protobuf and it is typical to just choose the "next" id. On-the-wire protobuf uses the id to identify the field, so it is important you don't change the id later.
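The efficiency point can be checked with a few lines of arithmetic. On the wire, each field starts with a tag, which is the varint encoding of (field_number << 3) | wire_type. The sketch below computes the tag size for a few ids; tag_size is a helper written for this example, not a library function.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def tag_size(field_number: int, wire_type: int = 2) -> int:
    """Number of bytes the field tag occupies (wire type 2 = length-delimited)."""
    return len(encode_varint((field_number << 3) | wire_type))


for number in (3, 15, 16, 10000):
    print(number, tag_size(number))
# 3 1
# 15 1
# 16 2
# 10000 3
```

Field numbers 1 through 15 fit in a single tag byte, which is why the small ids are usually reserved for the most frequently set fields.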
In protobuf 3, all fields are "optional" in the protobuf 2 sense; there are no "required" fields. However, protobuf 2 also provided "field presence" for all fields. Protobuf 3 only provided field presence for oneofs and messages... until the recent re-introduction of the "optional" keyword.
In protobuf 3 if you call textMessage.getMynewfield() it will always return a non-null string. If the string was not sent, it will use the empty string (""). For integers 0 is returned and for messages the "default message" (all defaults) is returned. This is plenty for many use-cases, and may be enough for you.
But let's say you need to distinguish between "" and <notsent>. That's what field presence provides. Messages in protobuf 3 have "has" methods that return true if a value is present. But primitives don't have that presence information. One option is to "box" the primitive with standard wrappers that make the primitive a message. Another option available in newer versions of protobuf is the optional keyword. Both options will provide a method like textMessage.hasMynewfield().
message TestMessage {
string somefield = 1;
int32 someotherfield = 2;
// the wrapper type requires: import "google/protobuf/wrappers.proto";
google.protobuf.StringValue mynewfield = 3;
// -or-
optional string mynewfield = 3;
}
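To illustrate why an out-of-sync peer is not a problem, and what "presence" means at the byte level, here is a toy parser. It only understands varint and length-delimited fields and is a sketch rather than a real protobuf implementation. The payload models the optional string variant with mynewfield explicitly set to the empty string; note that this sketch simply drops unknown fields, whereas real parsers may preserve them.

```python
def read_varint(buf: bytes, pos: int):
    """Decode a varint starting at pos; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse(buf: bytes, known: set) -> dict:
    """Parse wire types 0 (varint) and 2 (length-delimited), keeping known fields."""
    fields, pos = {}, 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x7
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in known:  # unknown field numbers are skipped, not errors
            fields[number] = value
    return fields


# somefield = "a", someotherfield = 5, mynewfield = "" (explicitly set)
payload = bytes([0x0A, 0x01, 0x61, 0x10, 0x05, 0x1A, 0x00])

old_reader = parse(payload, known={1, 2})     # compiled before field 3 existed
new_reader = parse(payload, known={1, 2, 3})
print(old_reader)       # {1: b'a', 2: 5}
print(3 in new_reader)  # True: present, even though the value is empty
```

The old reader steps over field 3 without failing, and the new reader can distinguish "sent as empty" from "not sent" because the tag itself is on the wire.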

Protocol buffer zero value for integer

I have a Go struct that we currently use in our RESTful API, which looks like this:
type Req struct {
Amount *int
}
I'm using a pointer here because if Amount is nil, it means Amount was not filled in; if Amount isn't nil but is zero, it means the field was filled in and the value is zero.
We have started moving to proto files: the main API receives the request over HTTP and forwards it to the next service through gRPC using the same proto file. The problem is that proto3 does not generate a pointer for Amount. That's understandable, because protocol buffers are designed for sending data between separate systems, but how can I handle the case above? When I receive the request, I can't tell whether Amount is nil or just zero.

proto3 doesn't distinguish between zero and absent; the concepts of defaults and implicit vs explicit values disappeared:
the default value is always zero (or false, etc)
if the value is zero, it isn't sent; otherwise, it is
What you're after is more possible with proto2. Alternatively, just add a separate field to indicate that you have a value for something:
message Req {
int32 amount = 1;
bool amountHasValue = 2;
}
Or use a nested sub-message, i.e.
message Foo {
Bar bar = 1;
}
message Bar {
int32 amount = 1;
}
(So: without a value you just send an empty Foo; with a value, you send a Foo containing a Bar, and the amount is whatever it is, including zero.)
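The separate-flag approach amounts to a small conversion at each end. The sketch below uses plain dicts in place of the generated Req message, and to_wire / from_wire are hypothetical helper names; in Go the same mapping would go between *int and the (amount, amountHasValue) pair.

```python
from typing import Optional


def to_wire(amount: Optional[int]) -> dict:
    """Map a nullable amount onto the (amount, amountHasValue) pair."""
    return {"amount": amount or 0, "amountHasValue": amount is not None}


def from_wire(msg: dict) -> Optional[int]:
    """Recover the nullable amount; missing keys model omitted default values."""
    if not msg.get("amountHasValue", False):
        return None
    return msg.get("amount", 0)


print(from_wire(to_wire(None)))  # None
print(from_wire(to_wire(0)))     # 0
print(from_wire(to_wire(7)))     # 7
```

The explicit zero survives the round trip because amountHasValue is true, and true is not a default value, so it is always sent.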

How to ask protoc to use a value instead of a pointer as the value side of a map with Go?

I am using protocol buffers defined like this:
message Index {
message albums {
repeated string name = 1;
}
map<string, albums> artists_albums = 1;
map<int32, albums> year_albums = 2;
}
It generates go code like this:
type Index struct {
ArtistsAlbums map[string]*IndexAlbums
YearAlbums map[int32]*IndexAlbums
}
How can I make it generate map values of type IndexAlbums instead of *IndexAlbums?

If you use gogoprotobuf, there is an extension that allows this:
map<string, albums> artists_albums = 1 [(gogoproto.nullable) = false];
With regular goprotobuf I don't believe there is a way.
From the gogoprotobuf documentation:
nullable, if false, a field is generated without a pointer (see warning below).
Warning about nullable: According to the Protocol
Buffer specification, you should be able to tell whether a field is
set or unset. With the option nullable=false this feature is lost,
since your non-nullable fields will always be set. It can be seen as a
layer on top of Protocol Buffers, where before and after marshalling
all non-nullable fields are set and they cannot be unset.

With Protocol Buffers, is it safe to move enum from inside message to outside message?

I've run into a use case where I'd like to move an enum declared inside a protocol buffer message to outside the message, so that other messages can use the same enum.
That is, I'm wondering if there are any issues moving from this
message Message {
enum Enum {
VALUE1 = 1;
VALUE2 = 2;
}
optional Enum enum_value = 1;
}
to this
enum Enum {
VALUE1 = 1;
VALUE2 = 2;
}
message Message {
optional Enum enum_value = 1;
}
Would this cause any issues de-serializing data created with the first protocol buffer definition into the second?

It doesn't change the serialization data at all - the location / name of the enums are irrelevant for the actual data, since it just stores the integer value.
What might change is how some languages consume the enum, i.e. how they qualify it. Is it X.Y.Foo, X.Foo, or just Foo. Note that since enums follow C++ naming/scoping rules, some things (such as conflicts) aren't an issue: but it may impact some languages as consumers.
So: if you're the only consumer of the .proto, you're absolutely fine here. If you have shared the .proto with other people, it may be problematic to change it unless they are happy to update their code to match any new qualification requirements.
