ProtoParquetWriter doesn't write false values, 0s, and empty strings - protocol-buffers

In the following example:
try (ParquetWriter<Example> writer =
    new ProtoParquetWriter<>(
        new Path("file:/tmp/foo.parquet"),
        Example.class,
        SNAPPY,
        DEFAULT_BLOCK_SIZE,
        DEFAULT_PAGE_SIZE)) {
  writer.write(
      Example.newBuilder()
          .setTs(System.currentTimeMillis())
          .setTenantId("tenant")
          .setSomeFlag(false)
          .setSomeInt(1)
          .setOtherInt(0)
          .build());
}
And the example .proto file:
syntax = "proto3";
package com.example;
message Example {
  uint64 ts = 1;
  string tenantId = 2;
  bool someFlag = 3;
  int32 someInt = 4;
  int32 otherInt = 2;
}
The resulting parquet file won't have the fields someFlag and otherInt because they are false and 0 respectively.
Is there a way to make it write them anyway, or should I handle this on the reader side?

Historically, proto3 did not track field presence for scalar fields, and the only presence rule was based on zero defaults: a field holding its default value was not serialized. Fortunately, this changed in newer versions of protoc. The optional keyword can now be used in front of proto3 fields to enable presence tracking. So: add optional, and any compliant implementation should do what you want. The defaults are still zero/false/etc., but values that are explicitly set are serialized.
syntax = "proto3";
package com.example;
message Example {
  optional uint64 ts = 1;
  optional string tenantId = 2;
  optional bool someFlag = 3;
  optional int32 someInt = 4;
  optional int32 otherInt = 2; // [sic]
}
Also, the second field number 2 (on otherInt) should be 5; field numbers must be unique within a message.
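
To make the presence rule concrete, here is a toy Python model of how a proto3-style serializer could decide which scalar fields to emit. It is not the protobuf library and not ProtoParquetWriter's actual code; the function name fields_to_serialize and the dict-based message are assumptions made purely for illustration.

```python
# Toy model of proto3 field presence; not the real protobuf implementation.

def fields_to_serialize(values, explicitly_set, optional_fields):
    """Return the names of the fields a proto3-style serializer would emit.

    values          -- field name -> current value
    explicitly_set  -- names the caller invoked a setter on
    optional_fields -- names declared with the `optional` keyword
    """
    emitted = []
    for name, value in values.items():
        if name in optional_fields:
            # Explicit presence: emit whenever the field was set, even to a default.
            if name in explicitly_set:
                emitted.append(name)
        else:
            # Implicit presence: a value equal to the type's zero default is skipped.
            if value != type(value)():
                emitted.append(name)
    return emitted


# Mirrors the values written in the question.
message = {"ts": 1700000000000, "tenantId": "tenant", "someFlag": False,
           "someInt": 1, "otherInt": 0}
all_set = set(message)

without_optional = fields_to_serialize(message, all_set, optional_fields=set())
with_optional = fields_to_serialize(message, all_set, optional_fields=set(message))
```

In this model, without_optional omits someFlag and otherInt, matching what the question observes, while with_optional keeps all five fields.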

Related

Convert golang protobuf enum from int32 to string

I defined an enum in protobuf, but its value in the generated code is an int32.
I would like some way to make the protobuf-defined enums come out as strings,
or a code hack for doing this in the gateway without changing the protobuf.
Enum definition:
enum TimeUnit {
  seconds = 0;
  minutes = 1;
  hours = 2;
  days = 3;
  months = 4;
}
message CacheDuration {
  uint32 Value = 1;
  TimeUnit Units = 2;
}
What I get from the generated code now is an int32, and that is the value returned for the front end to use, so they see Units as an int32.
The services communicate using the generated protobuf structs.
I want to change it to
"Units":"days"
Thanks
You can use the String method in your Go code:
generatedTimeUnitEnum.String() // output: days
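
The idea behind that call is a lookup from the enum's number to its declared name. A minimal, language-neutral sketch of the same mapping in Python follows; the TimeUnit class is a hand-written stand-in for the generated type, and cache_duration_to_json is a hypothetical helper, not part of any protobuf or gateway API.

```python
import json
from enum import IntEnum


class TimeUnit(IntEnum):
    """Hand-written stand-in mirroring the TimeUnit enum from the question."""
    seconds = 0
    minutes = 1
    hours = 2
    days = 3
    months = 4


def cache_duration_to_json(value, units):
    """Render a CacheDuration-like payload with the enum shown by name."""
    # TimeUnit(units).name performs the number -> name lookup.
    return json.dumps({"Value": value, "Units": TimeUnit(units).name})


rendered = cache_duration_to_json(7, 3)
```

Here rendered carries "Units" as the string "days" instead of the number 3.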

What is the difference between google.protobuf.Any and google.protobuf.Value?

I want to serialize int/int64/double/float/uint32/uint64 values into protobuf. Which one should I use? Which one is more efficient?
For example :
message Test {
  google.protobuf.Any any = 1; // solution 1
  google.protobuf.Value value = 2; // solution 2
};
message Test { // solution 3
  oneof Data {
    uint32 int_value = 1;
    double double_value = 2;
    bytes string_value = 3;
    ...
  };
};
In your case, you'd better use oneof.
You cannot pack a built-in type, e.g. double, int32, or int64, into google.protobuf.Any, or unpack one from it. You can only pack and unpack messages, i.e. classes derived from google::protobuf::Message.
google.protobuf.Value is, in fact, a wrapper around a oneof:
message Value {
  // The kind of value.
  oneof kind {
    // Represents a null value.
    NullValue null_value = 1;
    // Represents a double value.
    double number_value = 2;
    // Represents a string value.
    string string_value = 3;
    // Represents a boolean value.
    bool bool_value = 4;
    // Represents a structured value.
    Struct struct_value = 5;
    // Represents a repeated `Value`.
    ListValue list_value = 6;
  }
}
Also, from the definition of google.protobuf.Value you can see that there are no int32, int64, or uint64 fields, only a double field. IMHO (correct me if I'm wrong), you might lose precision if the integer is very large, since a double represents integers exactly only up to 2^53. Normally, google.protobuf.Value is used with google.protobuf.Struct. Check google/protobuf/struct.proto for details.
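
The precision concern can be checked without protobuf at all, because Python's float is the same IEEE 754 double that number_value uses. The sketch below is only an illustration of the rounding; survives_double_round_trip is a hypothetical helper name.

```python
# google.protobuf.Value stores every number in a double (number_value).
# Python's float is an IEEE 754 double, so it shows the same rounding.

LARGEST_EXACT = 2 ** 53  # every integer up to this magnitude fits in a double exactly


def survives_double_round_trip(n):
    """True if integer n comes back unchanged after being stored as a double."""
    return int(float(n)) == n


small_ok = survives_double_round_trip(LARGEST_EXACT)        # still exact
large_lost = survives_double_round_trip(LARGEST_EXACT + 1)  # rounded to a neighbour
max_int64_ok = survives_double_round_trip(2 ** 63 - 1)      # largest int64 value
```

So an int64 or uint64 above 2^53 may not round-trip through google.protobuf.Value, which is one more reason to prefer the oneof with dedicated integer fields.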

How to replace proto2 extensions with proto3 Any when the extensions define different numbers of fields?

I'm trying to learn proto3 and have some questions about Any.
I use extensions quite a lot. Suppose my proto looks like this:
message base {
  extensions 1 to 100;
}
// a.proto
extend base {
  optional int32 a = 1;
  optional int32 b = 2;
}
// b.proto
extend base {
  optional string c = 1;
  optional string d = 2;
  optional string e = 3;
  optional string f = 4;
}
Then how do I replace these extensions with Any? Do I have to write something like
import "google/protobuf/any.proto";
message base {
  google.protobuf.Any a = 1;
  google.protobuf.Any b = 2;
  google.protobuf.Any c = 3;
  google.protobuf.Any d = 4;
}
?
Many protos may extend base.proto, and I cannot know the maximum extension number among them. How can I replace these extensions with Any in that case?
If I have to declare Any fields from 1 to 100 in message base, that would be terrible!
You would typically structure it like this:
import "google/protobuf/any.proto";
message base {
  google.protobuf.Any submsg = 1;
}
// a.proto
message submsg_a {
  optional int32 a = 1;
  optional int32 b = 2;
}
// b.proto
message submsg_b {
  optional string c = 1;
  optional string d = 2;
  optional string e = 3;
  optional string f = 4;
}
And then put either submsg_a or submsg_b inside the Any field.
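
Conceptually, an Any is just a type URL plus the serialized bytes of one message, which is why a single field can carry either sub-message. The Python sketch below models that idea only; the Any dataclass, pack, and unpack are hand-written stand-ins, and JSON replaces the real binary encoding to keep the example dependency-free.

```python
import json
from dataclasses import dataclass


@dataclass
class Any:
    """Stand-in for google.protobuf.Any: a type URL plus opaque payload bytes."""
    type_url: str
    value: bytes


def pack(type_name, fields):
    """Wrap a message (here: a dict) together with a URL naming its type."""
    # The real Any stores the message's binary encoding; JSON is an assumption
    # of this sketch.
    return Any("type.googleapis.com/" + type_name, json.dumps(fields).encode())


def unpack(any_msg, expected_type_name):
    """Return the payload if the type matches, otherwise None."""
    if not any_msg.type_url.endswith("/" + expected_type_name):
        return None
    return json.loads(any_msg.value.decode())


# One `submsg` slot on base, carrying either sub-message.
base_with_a = {"submsg": pack("submsg_a", {"a": 1, "b": 2})}
base_with_b = {"submsg": pack("submsg_b", {"c": "x", "d": "y", "e": "", "f": ""})}
```

The receiver inspects the type URL to decide which sub-message to decode, so base never needs to know how many fields each extension-like message defines.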

Error while importing another proto file

I get an error when I try to compile a proto file to Java.
Could you point out what I'm missing?
protoc --proto_path=src\main\resources\proto --java_out=src\main\java src\main\resources\proto\PayloadProtocol.proto
PayloadProtocol.proto:32:14: "DataContainer" is not defined.
PayloadProtocol.proto: warning: Import BackendCommunicationService.proto but not used.
PayloadProtocol.proto
import "BackendCommunicationService.proto";
package com.fleetboard.tp.payload.protocol.protobuf;
option java_package = "com.fleetboard.tp.proto.protocol";
message TPMessage {
  required int32 serviceId = 1; // telematic service (TS) id, who owns this message
  required int32 functionId = 2; // function id refers to the Java class for the payload
  optional uint64 requestId = 3; // Identifier to associate the request to a response
  optional TPPayload payload = 4; // serialized representation of a TP message
  optional uint64 durability = 5; // life time of message - used from backend
  optional DataContainer dataPayload = 6;
}
BackendCommunicationService.proto
package com.fleetboard.tp.backend.protobuf;
option java_package = "com.fleetboard.tp.proto.backend";
message DataContainer {
  required DeviceApplication application = 1; // The container's recipient (MT) or sender (MO)
  required string fileName = 2; // File name (no path), length up to 255
  required uint64 fileTime = 3; // File time as ms since 1970-01-01 00:00 UTC
}
Fully qualify the name in the importing file:
com.fleetboard.tp.backend.protobuf.DataContainer
or
.com.fleetboard.tp.backend.protobuf.DataContainer
(the . ensures it starts at the root)
You could also try using just the disjoint part, but I don't know if it will work:
backend.protobuf.DataContainer
(since both have the com.fleetboard.tp. prefix)
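
The protobuf language guide describes type name resolution as working like C++: the innermost scope is searched first, then each enclosing package, and a leading . starts from the root. The Python sketch below is a simplified model of that rule, not protoc's code; resolve and is_known are hypothetical helpers, and the set of defined names is limited to the two messages shown in the question.

```python
def is_known(prefix, defined):
    """True if prefix is a defined type or a package containing one."""
    return any(d == prefix or d.startswith(prefix + ".") for d in defined)


def resolve(name, scope, defined):
    """Resolve a type reference written inside package `scope`.

    Returns the fully qualified name, or None if it cannot be resolved.
    """
    if name.startswith("."):
        # A leading dot means: start at the root, no scope search.
        full = name[1:]
        return full if full in defined else None

    first = name.split(".")[0]
    parts = scope.split(".")
    # Try the innermost scope first, then each enclosing package, then the root.
    for depth in range(len(parts), -1, -1):
        prefix = ".".join(parts[:depth])
        candidate = f"{prefix}.{first}" if prefix else first
        if is_known(candidate, defined):
            # In this model the first matching component fixes the scope.
            full = f"{prefix}.{name}" if prefix else name
            return full if full in defined else None
    return None


defined = {
    "com.fleetboard.tp.backend.protobuf.DataContainer",
    "com.fleetboard.tp.payload.protocol.protobuf.TPMessage",
}
scope = "com.fleetboard.tp.payload.protocol.protobuf"

bare = resolve("DataContainer", scope, defined)
partial = resolve("backend.protobuf.DataContainer", scope, defined)
rooted = resolve(".com.fleetboard.tp.backend.protobuf.DataContainer", scope, defined)
```

In this model bare is None, which matches the "is not defined" error, while both partial and rooted resolve to com.fleetboard.tp.backend.protobuf.DataContainer. That suggests the shorter backend.protobuf.DataContainer form is worth trying, with the fully qualified name as the safe fallback.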

How to write an inline array of protobuf

I use Google's protobuf and have a message with many fields like the following:
optional string force_sampling = 1;
optional string status = 2;
optional string host = 3;
optional string server_addr = 4;
optional string server_port = 5;
optional string client_addr = 6;
optional string request = 7;
optional string msec = 8;
optional string request_time = 9;
optional string logid = 10;
optional string request_body = 11;
optional string response_body = 12;
optional string other = 100;
So, when I set values on a message, I write many calls like the following:
set_logid(); set_request_body(); set_other(); set_request_body(); etc.
Is there an easier way to do that?
For example, something like:
array way = {"set_logid", "set_other"};
for (i = 0; i < len; i++)
{
    sample.way[i]();
}
By the way, set_logid is inline.
You can call Message::GetReflection() and use the returned Reflection object, together with the message's Descriptor, to access fields by a name given as a string.
The documentation is here:
https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.message#Reflection
However, this will turn out to be slower and more complex, so it might not be worth it.
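
The shape of that table-driven approach can be sketched in a few lines of Python. This is only an analogy: Sample is a hand-written stand-in for the generated message, and set_fields_by_name is a hypothetical helper. In C++ the name lookup would go through Descriptor::FindFieldByName and the assignment through Reflection::SetString.

```python
class Sample:
    """Stand-in for a generated message with a few of the string fields above."""

    def __init__(self):
        self.logid = ""
        self.request_body = ""
        self.other = ""


def set_fields_by_name(message, values):
    """Assign each value to the field whose name is given as a string."""
    for field_name, value in values.items():
        if not hasattr(message, field_name):
            # Analogous to a failed field lookup by name.
            raise KeyError(f"unknown field: {field_name}")
        setattr(message, field_name, value)


sample = Sample()
set_fields_by_name(sample, {"logid": "abc123", "other": "misc"})
```

The loop replaces the long run of individual setter calls, at the cost of a name lookup per field and the loss of compile-time checking.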

Resources