Oneof kind vs Enum in protobuf

What's the difference between using an Enum and a oneof kind in protobuf3? As far as I can tell, an Enum restricts the field to be one of a predefined set of values, but so does the oneof kind.

Enums are named numbers. You define the names in the enum definition and assign each of them a value. In proto3 the first value of an enum must be zero, because zero is the default a field has when it is not set.
enum State {
A = 0;
B = 1;
C = 2;
}
Next, you can use this enum in any of your messages:
message Update {
State currentState = 1;
State previousState = 2;
}
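To make "named numbers" concrete, here is a minimal plain C++ sketch that mirrors the two definitions above. It is not the code protoc generates; the struct layout and the helpers number_of and make_update are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical C++ mirror of the State enum: every name is just a number.
enum class State : std::int32_t { A = 0, B = 1, C = 2 };

// Hypothetical mirror of the Update message. Value-initialised fields start
// at zero, which is why the enum needs an entry with the value 0.
struct Update {
    State currentState{};
    State previousState{};
};

// Returns the number behind a name, i.e. what would travel on the wire.
constexpr std::int32_t number_of(State s) {
    return static_cast<std::int32_t>(s);
}

// Builds an Update with only currentState set; previousState keeps its default.
inline Update make_update(State current) {
    Update u;
    u.currentState = current;
    return u;
}
```

With make_update(State::C), currentState carries the number 2, while the untouched previousState still reads as A, the zero value.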
A oneof is something very different. It allows you to send different types while only allocating limited memory for them, because you can only set one of those types at a time. This is similar to a union in C/C++ or a std::variant as of C++17.
Take this example, in which we have a message, an integer and a double defined in our oneof:
// The message in our oneof
message someMsg {
// Multiple fields
}
// The message holding our oneof
message msgWithOneof {
oneof theOneof {
someMsg msg = 1;
int32 counter = 2;
double value = 3;
}
// Feel free to add more fields here or before the oneof
}
You can only have one of msg, counter or value set at a time. Setting one of them clears whichever field was set before.
Assuming a C/C++ implementation, the largest field determines the amount of memory allocated. Say someMsg is the largest: setting an integer or a double is not a problem, as they fit in the same space. If you do not use a oneof, the total memory allocated would be the sum sizeof(someMsg) + sizeof(int32) + sizeof(double).
There is some overhead to keep track of which field has been set. In Google's C++ implementation the generated class stores a small "case" value next to the oneof that records which member is set. This is comparable to the presence information kept for fields marked optional.
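The std::variant comparison can be sketched in plain C++. This is only a model of the behaviour described above, not what protoc emits: SomeMsg with its 64-byte payload, the AllFields struct and the setter and has_ names are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Hypothetical stand-in for someMsg: a message with a sizeable payload.
struct SomeMsg {
    char payload[64];
};

// Model of theOneof: at most one alternative is alive at any time.
// std::monostate represents "nothing set".
using TheOneof = std::variant<std::monostate, SomeMsg, std::int32_t, double>;

// The same three fields without a oneof: each one gets its own storage.
struct AllFields {
    SomeMsg msg;
    std::int32_t counter;
    double value;
};

struct MsgWithOneof {
    TheOneof theOneof;

    // Each setter replaces whatever alternative was stored before.
    void set_msg(const SomeMsg& m) { theOneof.emplace<SomeMsg>(m); }
    void set_counter(std::int32_t c) { theOneof.emplace<std::int32_t>(c); }
    void set_value(double v) { theOneof.emplace<double>(v); }

    bool has_msg() const { return std::holds_alternative<SomeMsg>(theOneof); }
    bool has_counter() const { return std::holds_alternative<std::int32_t>(theOneof); }
    bool has_value() const { return std::holds_alternative<double>(theOneof); }
};

// The variant stores its alternatives in place, so it is at least as large
// as the largest one; the extra bytes are the "which one is set" bookkeeping.
static_assert(sizeof(TheOneof) >= sizeof(SomeMsg), "largest member must fit");
```

Calling set_counter and then set_value leaves only value set. On common standard library implementations sizeof(TheOneof) also comes out smaller than sizeof(AllFields), although the C++ standard does not promise an exact figure.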

Related

Protobuf use repeated oneof message vs many empty fields

I'm designing a protobuf to represent an event, where each event can hold extra fields.
There are a lot of possible extra fields (~100), but only a small portion of them will be used in each message (~3).
Each extra field will be used at most once, but several of them can be present, so I would like to have something like an anyof message type. Unfortunately, there is no such thing in protobuf.
To mock this behavior, and as mentioned in this discussion, I thought I could put all my extra fields in a oneof, wrap it in a message, and use that message as a repeated field in my event:
message ExtraField {
oneof extra_field_value {
string extraData1 = 1;
uint64 extraData2 = 2;
....
SomeOtherMessage extraData100 = 100;
}
}
message MyEvent {
uint64 timestamp = 1;
string event_name = 2;
string some_other_data = 3;
...
repeated ExtraField extra_fields = 8;
}
Although this solution looks more explicit to me, it isn't the most memory-efficient, and a repeated message containing a oneof allows the same extra field to be added more than once (unwanted behavior).
I could also just write all the extra fields as-is in an inner message, but most of them will be empty all the time:
message ExtraFields {
string extraData1 = 1;
uint64 extraData2 = 2;
....
SomeOtherMessage extraData100 = 100;
}
message MyEvent {
uint64 timestamp = 1;
string event_name = 2;
string some_other_data = 3;
...
ExtraFields extra_fields = 8;
}
If I understand correctly, leaving fields unset in a message isn't going to make my serialized data larger, and therefore the second protobuf design is the preferred practice.
Am I correct?
Is there another protobuf design for my needs?
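On the size question: protobuf only writes fields that are set, each as a key (field number plus wire type) followed by its value, so an unset field costs no bytes on the wire. The sketch below estimates the encoded size of both designs under simplifying assumptions: every extra field is a uint64 encoded as a varint, every entry in the map counts as set, and SetFields and the three size functions are hypothetical names for this example, not part of any protobuf API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Number of bytes a base-128 varint needs to hold v.
inline std::size_t varint_size(std::uint64_t v) {
    std::size_t n = 1;
    while (v >= 0x80) {
        v >>= 7;
        ++n;
    }
    return n;
}

// A field key is (field_number << 3) | wire_type, itself stored as a varint.
inline std::size_t tag_size(std::uint32_t field_number) {
    return varint_size(static_cast<std::uint64_t>(field_number) << 3);
}

// Hypothetical model: field number -> value, holding only the fields that are set.
using SetFields = std::map<std::uint32_t, std::uint64_t>;

// Payload of a single ExtraFields message: unset fields contribute nothing.
inline std::size_t flat_payload_size(const SetFields& fields) {
    std::size_t total = 0;
    for (const auto& [number, value] : fields)
        total += tag_size(number) + varint_size(value);
    return total;
}

// Second design: ExtraFields embedded once as field 8 of MyEvent.
inline std::size_t flat_embedded_size(const SetFields& fields) {
    const std::size_t inner = flat_payload_size(fields);
    return tag_size(8) + varint_size(inner) + inner;
}

// First design: every set field wrapped in its own ExtraField entry of field 8.
inline std::size_t repeated_wrapper_size(const SetFields& fields) {
    std::size_t total = 0;
    for (const auto& [number, value] : fields) {
        const std::size_t inner = tag_size(number) + varint_size(value);
        total += tag_size(8) + varint_size(inner) + inner;
    }
    return total;
}
```

Under these assumptions the flat design pays the field-8 key and length prefix once, while the repeated wrapper pays them for every entry, so the flat design is never larger once at least one field is set. The flat design also rules out duplicates, since a plain field can only hold one value.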

Don't know how to reflect on a oneof field

I have a protobuf message defined something like this
message Foo
{
oneof test //oneof field
{
int32 a = 1;
MM b = 2;
}
}
message MM
{
string str =1;
}
How do I reflect on a oneof field in protobuf?
For the most part, you handle a message with oneof the same way you would a message without:
message Foo
{
int32 a = 1;
MM b = 2;
}
Oneof is largely transparent to reflection, and doesn't affect the wire format. Its effect is that the generated setter code automatically clears other members of a oneof whenever one is set.
Now, if you care about oneofs for some reason, there's Descriptor::oneof_decl that allows you to enumerate them, Descriptor::FindOneofByName, and FieldDescriptor::containing_oneof (if you are working your way from the field up). With OneofDescriptor in hand, you can find its name and enumerate its fields, and that's pretty much it.

Encoding repeated entries in pbtools

I have a protobuf schema with a bunch of repeated structures. Something like
syntax = "proto3";
package My;
message TopLevel
{
string swVersion = 3;
string reportMac = 4;
string reportSsid = 6;
}
message Temperature
{
uint64 ts = 1;
uint32 source = 3;
repeated sint32 readings = 4;
}
message MyMessage
{
TopLevel topLevel = 1;
repeated Temperature temperature = 2;
}
I compile with pbtools and get the structures and functions for Temperature and readings. However, I am having a hard time figuring out how to add "Temperature" entries dynamically.
Or am I out of luck, and does pbtools require being told ahead of time how many entries I have? One problem is that the data is encoded as it is generated, and I do not know how many entries of each kind I will have per report.
I attached the generated code.
pbtools requires the length before adding any items.

How to create object from repeated type protobuf

What I am looking for is a function that returns the message of a repeated field.
I know there is Reflection::AddMessage, which has the return type that I want, but I do not want to add a message, just get an object of that message type.
Here is an example of what I am trying to do. Let's say I have this in the .proto file:
message Bar{
uint32 x = 1;
uint64 y = 2;
}
message Foo{
repeated Bar myMessage = 1;
}
I am using reflection to iterate through the Foo message and I want to be able to do something like this:
Message* Msg = createMessage(refl->FooMsg, FieldDesc)
I know there is also GetRepeatedMessage but that requires index.
First of all, when the protobuf compiler generates the C++ code, you get accessor functions in the interface. There is mutable_nameOf_message(), which returns the entire repeated field (a RepeatedPtrField in C++, which offers a std::vector-like interface), and mutable_nameOf_message(index), which gives you the specified element.
Now, if you do not want to use Bar as a separate top-level message, you don't need to:
message ArrayOfBar
{
repeated Bar arrayOfBar = 1;
message Bar{
uint32 x = 1;
uint64 y = 2;
}
}
If that's what you had in mind, you could also do something like this:
std::vector<Bar> arrayOfBars;
But that idea needs refinement because of the internal specifics of protobuf; some unwanted behavior might occur with something like that.

protobuf.serializer.serialize equivalence in C++

I am writing an adapter class (library function) which takes different kinds of PB messages as input in the form of a std::map, serializes this map and writes it to a file, and also does the reverse.
Example:
message user_defined_type
{
optional int32 Val1 = 1;
optional string Val2 = 2;
}
message Store
{
optional int32 Key = 1;
optional user_defined_type Value = 2;
}
The client will create a std::map and store the above messages in it (i.e., std::map XYZ). The library takes the std::map as input, serializes the messages and stores them in the file. But the library doesn't have/know the proto message definitions.
To achieve the above, I came up with an approach: the library will have an intermediate proto message in which both fields are of type bytes:
message MAP
{
optional bytes KeyField = 1;
optional bytes ValueField = 2;
}
That way KeyField holds the value of Store::Key and ValueField holds the serialized user_defined_type from Store::Value, so the serialization and de-serialization are generic for all types of messages.
In C#, using protobuf.serializer.serialize, I can serialize/de-serialize to the designated type, but I don't know how to do this in C++. Any help or pointers would be much appreciated.
If I understand correctly, the challenge is that your library needs to know how to parse ValueField (and perhaps KeyField) but the library itself does not know their types; only the caller of the library does.
The way to solve this is to have the caller pass in a "prototype" instance of their message type, from which you can spawn additional instances:
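map<string, Message*> parse(string data, const Message& prototype) {
  map<string, Message*> result;
  // MapProto is the library's own wrapper message (definition not shown),
  // with a repeated "entry" field of key/value pairs.
  MapProto proto;
  proto.ParseFromString(data);
  for (int i = 0; i < proto.entry_size(); i++) {
    // prototype is a reference, so call New() with '.'; the caller owns the result.
    Message* value = prototype.New();
    value->ParseFromString(proto.entry(i).value());
    result[proto.entry(i).key()] = value;
  }
  return result;
}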
The caller would call this function like:
map<string, Message*> items = parse(data, MyValueType::default_instance());
Then, each message in the returned map will be an instance of MyValueType (the caller can use static_cast to cast the pointer to that type). The trick is that we had the caller pass in the default instance of their type, and we called its New() method to construct additional instances of the same type.
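The prototype trick itself does not depend on protobuf, so it can be tried out with a small stand-alone mock. In this sketch Message is a hypothetical stand-in that models only New() and ParseFromString(), MyValueType simply stores the bytes it is given, and New() returns a std::unique_ptr rather than the raw pointer that protobuf's Message::New() returns. Entries is likewise a made-up replacement for the wrapper message.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for google::protobuf::Message; only the two members
// the technique relies on are modelled.
class Message {
public:
    virtual ~Message() = default;
    // Creates a fresh, empty instance of the same dynamic type.
    virtual std::unique_ptr<Message> New() const = 0;
    virtual bool ParseFromString(const std::string& data) = 0;
};

// Hypothetical caller-side type; "parsing" here just stores the bytes.
class MyValueType : public Message {
public:
    std::unique_ptr<Message> New() const override {
        return std::make_unique<MyValueType>();
    }
    bool ParseFromString(const std::string& data) override {
        text = data;
        return true;
    }
    std::string text;
};

// Stand-in for the already-decoded wrapper message: (key, serialized value) pairs.
using Entries = std::vector<std::pair<std::string, std::string>>;

// Library side: it only knows the Message interface, never MyValueType.
inline std::map<std::string, std::unique_ptr<Message>>
parse(const Entries& entries, const Message& prototype) {
    std::map<std::string, std::unique_ptr<Message>> result;
    for (const auto& [key, bytes] : entries) {
        std::unique_ptr<Message> value = prototype.New();  // spawn from the prototype
        if (value->ParseFromString(bytes))
            result[key] = std::move(value);
    }
    return result;
}
```

The caller passes any instance of its own type as the prototype and casts the returned pointers back to that type. Using std::unique_ptr in the mock makes the ownership of the spawned objects explicit.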
