Protobuf use repeated oneof message vs many empty fields

I'm designing a protobuf message to represent an event, where each event can carry extra fields.
There are many possible extra fields (~100), but each message uses only a small number of them (~3).
Each extra field appears at most once per event, but several different ones can appear together. What I want is an "anyof" message type, but unfortunately protobuf has no such thing.
To imitate this behavior, and as mentioned in this discussion, I thought I could put all my extra fields in a oneof, wrap it in a message, and use that message as a repeated field in my event:
message ExtraField {
  oneof extra_field_value {
    string extraData1 = 1;
    uint64 extraData2 = 2;
    ....
    SomeOtherMessage extraData100 = 100;
  }
}
message MyEvent {
  uint64 timestamp = 1;
  string event_name = 2;
  string some_other_data = 3;
  ...
  repeated ExtraField extra_fields = 8;
}
This solution seems more explicit to me, but it isn't the most memory efficient, and a repeated message wrapping a oneof allows the same extra field to be added more than once (unwanted behavior).
Alternatively, I can declare all the extra fields directly in an inner message, but most of them will be empty most of the time:
message ExtraFields {
  string extraData1 = 1;
  uint64 extraData2 = 2;
  ....
  SomeOtherMessage extraData100 = 100;
}
message MyEvent {
  uint64 timestamp = 1;
  string event_name = 2;
  string some_other_data = 3;
  ...
  ExtraFields extra_fields = 8;
}
If I understand correctly, unset fields in a message do not make my serialized data any larger, so the second design is the preferred practice.
Am I correct?
Is there another protobuf design that fits my needs?
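To make the size question concrete, here is a minimal sketch that hand-encodes both designs using the protobuf wire format rules (varint keys of the form field_number << 3 | wire_type). It does not use the protobuf library; the helper names, the sample values, and the field extraData50 (assumed to be a uint64) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append v as a base-128 varint, least significant group first.
void put_varint(std::string& out, uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<char>(v));
}

// Write a varint field (wire type 0): key, then value.
void put_uint64(std::string& out, uint32_t field, uint64_t v) {
    put_varint(out, (static_cast<uint64_t>(field) << 3) | 0);
    put_varint(out, v);
}

// Write a length-delimited field (wire type 2): key, length, payload.
void put_bytes(std::string& out, uint32_t field, const std::string& payload) {
    put_varint(out, (static_cast<uint64_t>(field) << 3) | 2);
    put_varint(out, payload.size());
    out += payload;
}

// Second design: one ExtraFields message in field 8 of MyEvent.
// Only the three fields that are set produce any bytes.
std::string encode_single_message() {
    std::string extras;
    put_bytes(extras, 1, "abc");  // extraData1
    put_uint64(extras, 2, 42);    // extraData2
    put_uint64(extras, 50, 7);    // hypothetical extraData50, assumed uint64
    std::string event;
    put_bytes(event, 8, extras);
    return event;
}

// First design: one ExtraField wrapper per value, all in repeated field 8.
std::string encode_repeated_oneof() {
    std::string e1, e2, e3, event;
    put_bytes(e1, 1, "abc");
    put_uint64(e2, 2, 42);
    put_uint64(e3, 50, 7);
    put_bytes(event, 8, e1);  // every wrapper adds its own key and length byte
    put_bytes(event, 8, e2);
    put_bytes(event, 8, e3);
    return event;
}
```

With these sample values, encode_single_message() produces 12 bytes and encode_repeated_oneof() produces 16: the fields that are never set contribute nothing in either design, and the repeated form pays two extra bytes per wrapper. One caveat for the single-message design: in proto3, a plain scalar field set to its default value (0 or an empty string) is not written at all, so it cannot be told apart from an unset field unless the field is declared optional.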

Related

How to encode a repeated google.protobuf.any?

I have a message that I would like to pack into a repeated google.protobuf.Any field.
Is there a way to encode a repeated Any message type?
Can I even use the repeated label with google.protobuf.Any?
import "google/protobuf/any.proto";
message Onesensor {
  string name = 1;
  string type = 2;
  int32 reading = 3;
}
/** Any Message **/
message RepeatedAny {
  repeated google.protobuf.Any sensors = 1;
}
I am looking for an example; I am currently using nanopb to encode.

Sure, it is just a regular message.
https://github.com/nanopb/nanopb/tree/master/tests/any_type shows how to encode a single Any message; encoding many of them is like encoding any other array. You have a choice between allocating statically, allocating dynamically, or using callbacks. Or you can encode a single subfield at a time into the output stream, because concatenating encoded messages concatenates their repeated fields in the protobuf format.
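The concatenation property mentioned above can be checked with a small hand-rolled sketch. It assumes a message shaped like RepeatedAny (a single repeated, length-delimited field with number 1) and uses opaque strings in place of real encoded Any payloads; the helper names are hypothetical and this is not nanopb code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Append v as a base-128 varint.
void put_varint(std::string& out, uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<char>(v));
}

// Read a varint starting at pos and advance pos past it.
uint64_t get_varint(const std::string& in, size_t& pos) {
    uint64_t v = 0;
    int shift = 0;
    while (pos < in.size() && shift < 64) {
        uint8_t b = static_cast<uint8_t>(in[pos++]);
        v |= static_cast<uint64_t>(b & 0x7F) << shift;
        if (!(b & 0x80)) break;
        shift += 7;
    }
    return v;
}

// Encode a message that holds exactly one `sensors` entry
// (field 1, wire type 2). The payload stands in for an encoded Any.
std::string encode_entry(const std::string& payload) {
    std::string out;
    put_varint(out, (1u << 3) | 2u);
    put_varint(out, payload.size());
    out += payload;
    return out;
}

// Decode a RepeatedAny-shaped message by collecting every field-1 payload.
// This sketch only understands length-delimited fields.
std::vector<std::string> decode_entries(const std::string& in) {
    std::vector<std::string> entries;
    size_t pos = 0;
    while (pos < in.size()) {
        uint64_t key = get_varint(in, pos);
        uint64_t len = get_varint(in, pos);
        if (key == ((1u << 3) | 2u)) entries.push_back(in.substr(pos, len));
        pos += len;
    }
    return entries;
}
```

Decoding encode_entry("first") + encode_entry("second") yields two entries in order, which is why each element can be written to the output stream on its own without knowing the final count in advance.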

I think I found my issue: I cannot use the repeated label directly on google.protobuf.Any, because I would like to append the RepeatedAny messages in the final binary:
message Onesensor {
  string name = 1;
  string type = 2;
  int32 reading = 3;
}
message RepeatedAny {
  repeated google.protobuf.Any sensors = 1;
}
Instead I should use something like this:
message Onesensor {
  string name = 1;
  string type = 2;
  int32 reading = 3;
}
message SensorAny {
  google.protobuf.Any sensor = 1;
}
message RepeatedAny {
  repeated SensorAny sensors = 1;
}
I should not put the repeated label on google.protobuf.Any itself; I should put it on a message that contains the google.protobuf.Any, so that the binary has the form (sensors1), (sensors2).....(sensorsN), that is, one or more SensorAny messages.
Below is the sample code, in case someone finds this question in the future for nanopb:
/* First encode the SensorAny message by filling in its first field.
   That field is of type google.protobuf.Any, so it needs
   1. sensor.type_url
   2. sensor.value
   Proto_encode_string and Proto_encode_buffer are encode callbacks defined elsewhere (not shown). */
void* pBufAny = calloc(1, sBufSize);
pb_ostream_t ostream_any = pb_ostream_from_buffer(pBufAny, sBufSize);
SensorAny SensorAnyProto = SensorAny_init_default;
SensorAnyProto.has_sensor = true; /* presence flag of the `sensor` submessage */
SensorAnyProto.sensor.type_url.arg = "type.googleapis.com/Onesensor"; /* names the type packed in `value` */
SensorAnyProto.sensor.type_url.funcs.encode = Proto_encode_string;
ProtoEncodeBufferInfo_t BufInfo = {
    .Buffer = pBuf, /* I have already filled and encoded the Onesensor message into pBuf */
    .BufferSize = ostream.bytes_written,
};
SensorAnyProto.sensor.value.funcs.encode = Proto_encode_buffer;
SensorAnyProto.sensor.value.arg = &BufInfo;
if (!pb_encode(&ostream_any, SensorAny_fields, &SensorAnyProto)) {
    /* handle the error, e.g. log PB_GET_ERROR(&ostream_any) */
}
free(pBuf);
/* Now use the encoded buffer pBufAny to set the first repeated field in RepeatedAny */
RepeatedAny SensorAnyRepeated = RepeatedAny_init_default;
ProtoEncodeBufferInfo_t AnyBufInfo = {
    .Buffer = pBufAny,
    .BufferSize = ostream_any.bytes_written,
};
SensorAnyRepeated.sensors.arg = &AnyBufInfo;
SensorAnyRepeated.sensors.funcs.encode = Proto_encode_buffer;
void* pBufAnyRepeated = calloc(1, sBufSize);
pb_ostream_t ostream_repeated = pb_ostream_from_buffer(pBufAnyRepeated, sBufSize);
if (!pb_encode(&ostream_repeated, RepeatedAny_fields, &SensorAnyRepeated)) {
    /* handle the error, e.g. log PB_GET_ERROR(&ostream_repeated) */
}
free(pBufAny);

Oneof kind vs Enum in protobuf

What's the difference between using an Enum and a oneof kind in protobuf3? As far as I can tell, an Enum restricts the field to be one of a predefined set of values, but so does the oneof kind.

Enums are named numbers. You define the names in the enum definition and assign each of them a value. In proto3 the first value of an enum must be zero, and it acts as the default.
enum State {
  A = 0;
  B = 1;
  C = 2;
}
Next, you can use this enum in any of your messages:
message Update {
  State currentState = 1;
  State previousState = 2;
}
A oneof is something very different. It lets you send values of different types while allocating only limited memory for them, because only one of its members can be set at a time. This is similar to a union in C/C++ or a std::variant as of C++17.
Take this example, in which the oneof holds a message, an integer, and a double.
// The message in our oneof
message someMsg {
  // Multiple fields
}
// The message holding our oneof
message msgWithOneof {
  oneof theOneof {
    someMsg msg = 1;
    int32 counter = 2;
    double value = 3;
  }
  // Feel free to add more fields here or before the oneof
}
You can set only one of msg, counter, or value at a time. Setting one clears whichever member was set before.
Assuming a C/C++ implementation, the largest member determines the amount of memory allocated. Say someMsg is the largest: setting the integer or the double is not a problem, as they fit in the same space. Without a oneof, the total memory allocated would be sizeof(someMsg) + sizeof(int32) + sizeof(double).
There is some overhead to keep track of which field has been set. In the Google C++ implementation the generated class records this in a separate oneof-case member that holds the number of the field currently set. Fields marked optional need comparable bookkeeping, in their case a presence bit.
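The union/std::variant analogy can be sketched in plain C++17. This is only an illustration of the memory argument, not what any protobuf code generator emits; SomeMsg and its layout are hypothetical stand-ins for the generated type of someMsg.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Hypothetical stand-in for the generated type of `someMsg`.
struct SomeMsg {
    double samples[4];
};

// Without a oneof: every member gets its own storage.
struct AllFields {
    SomeMsg msg;
    int32_t counter;
    double value;
};

// With a oneof: one storage slot sized for the largest member,
// plus a discriminator. std::monostate models "nothing set".
using TheOneof = std::variant<std::monostate, SomeMsg, int32_t, double>;

// Setting `value` after `counter` replaces the previously held member.
bool setting_value_replaces_counter() {
    TheOneof field;
    field.emplace<int32_t>(5);
    field.emplace<double>(2.5);
    return std::holds_alternative<double>(field) &&
           !std::holds_alternative<int32_t>(field);
}
```

On common implementations sizeof(TheOneof) is smaller than sizeof(AllFields), since the variant stores only the largest alternative plus a small index, and the index plays the role of the bookkeeping that records which member is set.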

Don't know how to use reflection on a oneof field

I have a protobuf message defined something like this:
message Foo
{
  oneof test // oneof field
  {
    int32 a = 1;
    MM b = 2;
  }
}
message MM
{
  string str = 1;
}
How do I use reflection on a oneof field in protobuf?

For the most part, you handle a message with a oneof the same way you would a message without one:
message Foo
{
  int32 a = 1;
  MM b = 2;
}
Oneof is largely transparent to reflection, and doesn't affect the wire format. Its effect is that the generated setter code automatically clears other members of a oneof whenever one is set.
Now, if you care about oneofs for some reason, there's Descriptor::oneof_decl that allows you to enumerate them, Descriptor::FindOneofByName, and FieldDescriptor::containing_oneof (if you are working your way from the field up). With OneofDescriptor in hand, you can find its name and enumerate its fields, and that's pretty much it.

Encoding repeated entries in pbtools

I have a protobuf schema with a bunch of repeated structures. Something like:
syntax = "proto3";
package My;
message TopLevel
{
  string swVersion = 3;
  string reportMac = 4;
  string reportSsid = 6;
}
message Temperature
{
  uint64 ts = 1;
  uint32 source = 3;
  repeated sint32 readings = 4;
}
message MyMessage
{
  TopLevel topLevel = 1;
  repeated Temperature temperature = 2;
}
I compile it with pbtools and get the structures and functions for Temperature and readings. However, I am having a hard time figuring out how to add Temperature entries dynamically.
Or am I out of luck, and pbtools needs to be told ahead of time how many entries I have? One problem is that data is encoded as it is generated, and I do not know how many entries of each kind I will have for each report.
I attached the generated code.

pbtools requires the length before adding any items.

How to create object from repeated type protobuf

What I am looking for is a function that returns a message object for a repeated field.
I know there is Reflection::AddMessage, which has the return type that I want, but I do not want to add a message, just get an object of that message type.
Here is an example of what I am trying to do. Say I have this in the .proto file:
message Bar {
  uint32 x = 1;
  uint64 y = 2;
}
message Foo {
  repeated Bar myMessage = 1;
}
I am using reflection to iterate through the Foo message, and I want to be able to do something like this:
Message* Msg = createMessage(refl->FooMsg, FieldDesc);
I know there is also GetRepeatedMessage, but that requires an index.

First of all, when the protobuf compiler generates the C++ code, you get accessor functions in the interface. There is mutable_nameOf_message(), which returns a pointer to the entire repeated field (a RepeatedPtrField, which offers a std::vector-like interface in C++), and mutable_nameOf_message(index), which gives you the specified element.
Now, if you do not want to use Bar, then you do not need to.
message ArrayOfBar
{
  repeated Bar arrayOfBar = 1;
  message Bar {
    uint32 x = 1;
    uint64 y = 2;
  }
}
If that is what you had in mind, you could also do something like this:
std::vector<Bar> arrayOfBars;
But that idea needs refinement because of the internal specifics of protobuf. Some unwanted behavior might occur with something like that.
