How to parse only one field from a protobuf serialized byte array - protocol-buffers

I need only one field from a protobuf serialized byte array, but I have to call the parseFrom function, which performs poorly. How can I get the field's offset in the byte array and parse the value at that offset?

As mentioned in the Protocol Buffers documentation on fields order:
When a message is serialized, there is no guaranteed order for how its known or unknown fields will be written. Serialization order is an implementation detail, and the details of any particular implementation may change in the future. Therefore, protocol buffer parsers must be able to parse fields in any order.
Therefore you cannot have a 100% reliable offset to a field. You will need to call the ParseFrom function.
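To illustrate why a fixed offset is unreliable, here is a minimal sketch that hand-encodes the same two fields in both orders. The field numbers, the values, and the helper names are assumptions made up for this illustration, and the bytes are built by hand rather than by a protobuf library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Length-delimited field (wire type 2): key, length, payload."""
    payload = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(payload)) + payload


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Varint field (wire type 0): key, value."""
    return encode_varint(field_number << 3) + encode_varint(value)


# Hypothetical message: field 1 is a string, field 2 is an integer.
name = encode_string_field(1, "alice")
age = encode_varint_field(2, 30)

# Both byte strings carry the same fields, only in a different order.
order_a = name + age
order_b = age + name

# The offset of field 2 depends on the order the writer happened to use.
offset_a = order_a.index(age)
offset_b = order_b.index(age)
print(offset_a, offset_b)
```

Because a parser must accept either ordering, the position of field 2 cannot be treated as a constant.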

Related

Can a proto3 optional field be changed to repeated without breaking wire compatibility?

Let's say that I have a proto3 message defined as follows, for use as a gRPC request (i.e. using protobuf's binary encoding):
message MyRequest {
  string name = 1;
}
Can I change my server (i.e. the reader of the message) to use the following definition without breaking wire compatibility for existing clients (i.e. writers)?
message MyRequest {
  repeated string names = 1;
}
In the proto2 language guide, I see the following:
optional is compatible with repeated. Given serialized data of a repeated field as input, clients that expect this field to be optional will take the last input value if it's a primitive type field or merge all input elements if it's a message type field.
However, the proto3 documentation does not contain an equivalent statement. I think that this may be related to the use of the packed encoding for repeated fields in proto3.
Yes, this is possible, because the binary encoding of an optional string is the same as that of a repeated string with a single element. However, this change may be confusing to readers of the code because it is not immediately obvious that a message can be reinterpreted in this way.
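The byte-level equivalence can be checked with a small hand-written encoder and decoder. This is only a sketch: the helper names are hypothetical, the bytes are produced by hand rather than by a protobuf library, and the reader handles only length-delimited fields.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int):
    """Return (value, next_position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_string(field_number: int, text: str) -> bytes:
    """One singular string field: key (wire type 2), length, payload."""
    payload = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(payload)) + payload


def encode_repeated_string(field_number: int, texts) -> bytes:
    """A repeated string field is one key/length/payload record per element."""
    return b"".join(encode_string(field_number, t) for t in texts)


def read_strings(data: bytes, field_number: int):
    """Collect every occurrence of a string field, as a 'repeated' reader would."""
    values = []
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        if key & 0x07 != 2:
            raise ValueError("this sketch only handles length-delimited fields")
        length, pos = decode_varint(data, pos)
        payload = data[pos:pos + length]
        pos += length
        if key >> 3 == field_number:
            values.append(payload.decode("utf-8"))
    return values


singular = encode_string(1, "foo")
repeated_one = encode_repeated_string(1, ["foo"])
print(singular.hex(), repeated_one.hex())
print(read_strings(singular, 1))
```

A writer that still uses the singular definition therefore produces bytes that the repeated reader sees as a one-element list.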

Generate unique alpha-numeric object ID, like Parse

The title says what I need. I want to generate the ID using either PHP or JavaScript.
I think the class name and some properties are used to build the objectId, but perhaps someone already knows how it's done and could share it here?
The Parse Server generates the objectId. It is a randomly generated string of 10 characters. You can see the implementation in cryptoUtils.newObjectId(). From the code we can conclude that uniqueness is not enforced.
https://github.com/ParsePlatform/parse-server/blob/master/src/cryptoUtils.js
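A random 10-character alphanumeric ID of the kind described above could be generated as in the sketch below. This is not Parse Server's code: the function name, the alphabet, and the use of Python's secrets module are assumptions for illustration only.

```python
import secrets
import string

# Assumed alphabet: upper-case letters, lower-case letters, and digits.
ALPHABET = string.ascii_letters + string.digits


def new_object_id(size: int = 10) -> str:
    """Return a random alphanumeric string; uniqueness is NOT checked."""
    return "".join(secrets.choice(ALPHABET) for _ in range(size))


object_id = new_object_id()
print(object_id)
```

As in the answer above, nothing here enforces uniqueness; a caller that needs a guarantee would have to check for collisions in its own storage.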
Parse is probably using IDs generated by MongoDB. They are not random and can potentially be predicted:
A BSON ObjectID is a 12-byte value consisting of a 4-byte timestamp (seconds since epoch), a 3-byte machine id, a 2-byte process id, and a 3-byte counter
http://www.mongodb.org/display/DOCS/Object+IDs
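The layout quoted above can be made concrete by slicing a 12-byte ID into its parts. The sketch below follows only the quoted field sizes; the function and type names are hypothetical, the sample ID is an arbitrary example value, and the timestamp is assumed to be big-endian.

```python
from typing import NamedTuple


class ObjectIdParts(NamedTuple):
    timestamp: int      # seconds since epoch
    machine_id: bytes   # 3 bytes
    process_id: bytes   # 2 bytes
    counter: bytes      # 3 bytes


def split_object_id(hex_id: str) -> ObjectIdParts:
    """Split a 24-hex-character ObjectID using the 4/3/2/3 byte layout."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectID is 12 bytes (24 hex characters)")
    return ObjectIdParts(
        timestamp=int.from_bytes(raw[0:4], "big"),  # assumed big-endian
        machine_id=raw[4:7],
        process_id=raw[7:9],
        counter=raw[9:12],
    )


# Arbitrary example value, not taken from a real database.
parts = split_object_id("507f1f77bcf86cd799439011")
print(parts)
```

Since the leading bytes are a timestamp and the trailing bytes a counter, IDs created close together share most of their bytes, which is why they can potentially be predicted.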

Can .proto files' fields start at zero?

.proto examples all seem to start numbering their fields at one.
e.g. https://developers.google.com/protocol-buffers/docs/proto#simple
message SearchRequest {
  required string query = 1;
  optional int32 page_number = 2;
  optional int32 result_per_page = 3;
}
If zero could be used, some messages would be one or more bytes smaller (i.e. those with one or more fields numbered 16).
As the key is simply a varint encoding of (fieldnum << 3 | fieldtype) I can't immediately see why zero shouldn't be used.
Is there a reason for not starting the field numbering at zero?
One very immediate reason is that zero field numbers are rejected by protoc:
test.proto:2:28: Field numbers must be positive integers.
As to why Protocol Buffers has been designed this way, I can only guess. One nice consequence is that a message full of zeros will be detected as invalid. Zero can also be used internally in Protocol Buffers implementations as a return value meaning "no field".
Assigning Tags
As you can see, each field in the message definition has a unique numbered tag. These tags are used to identify your fields in the message binary format, and should not be changed once your message type is in use. Note that tags with values in the range 1 through 15 take one byte to encode, including the identifying number and the field's type (you can find out more about this in Protocol Buffer Encoding). Tags in the range 16 through 2047 take two bytes. So you should reserve the tags 1 through 15 for very frequently occurring message elements. Remember to leave some room for frequently occurring elements that might be added in the future.
The smallest tag number you can specify is 1, and the largest is 2^29 - 1, or 536,870,911. You also cannot use the numbers 19000 through 19999 (FieldDescriptor::kFirstReservedNumber through FieldDescriptor::kLastReservedNumber), as they are reserved for the Protocol Buffers implementation - the protocol buffer compiler will complain if you use one of these reserved numbers in your .proto. Similarly, you cannot use any previously reserved tags.
https://developers.google.com/protocol-buffers/docs/proto
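The one-byte and two-byte ranges quoted above follow from the key being a varint of (fieldnum << 3 | fieldtype). A minimal sketch that counts the key bytes for a few field numbers (the helper name is hypothetical):

```python
def key_size(field_number: int, wire_type: int = 0) -> int:
    """Number of bytes the varint-encoded key occupies on the wire."""
    key = (field_number << 3) | wire_type
    size = 1
    while key >= 0x80:  # each varint byte carries 7 bits of payload
        key >>= 7
        size += 1
    return size


for number in (1, 15, 16, 2047, 2048):
    print(number, key_size(number, wire_type=5))
```

Three bits of the key are taken by the wire type, so a single byte leaves four bits for the field number (up to 15), and two bytes leave eleven bits (up to 2047).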
As the documentation says, 0 can't be used: the smallest tag number you can specify is 1.

For SCSI, was there a change to the definition of ADDITIONAL LENGTH?

I am reading SCSI SPC4r22. Regarding ADDITIONAL LENGTH, all revisions prior to SPC3 state the following ("shall not be adjusted"):
From spc2r20.pdf:
"The ADDITIONAL LENGTH field shall specify the length in bytes of the parameters. If the ALLOCATION LENGTH of the CDB is too small to transfer all of the parameters, the ADDITIONAL LENGTH shall not be adjusted to reflect the truncation."
But I don't see that statement in SPC3 or SPC4. Has that been changed or am I missing the phrase? If I'm missing it, can someone please quote it?
It is just worded in a more general way:
"If the information being transferred to the Data-In Buffer includes fields containing counts of the number of bytes in some or all of the data, then the contents of these fields shall not be altered to reflect the truncation"
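The practical effect of this rule can be sketched as follows. The example assumes the standard INQUIRY data layout, where ADDITIONAL LENGTH sits in byte 4 and counts the bytes that follow it; the function names and the 36-byte response are hypothetical.

```python
ADDITIONAL_LENGTH_OFFSET = 4  # assumed: standard INQUIRY data, byte 4


def build_response(total_length: int) -> bytes:
    """Build a dummy response whose ADDITIONAL LENGTH describes the full data."""
    data = bytearray(total_length)
    data[ADDITIONAL_LENGTH_OFFSET] = total_length - (ADDITIONAL_LENGTH_OFFSET + 1)
    return bytes(data)


def truncate(response: bytes, allocation_length: int) -> bytes:
    """Return only as many bytes as the CDB allowed; field contents are unchanged."""
    return response[:allocation_length]


def full_length(partial: bytes) -> int:
    """Total size the device would return given a large enough allocation."""
    return partial[ADDITIONAL_LENGTH_OFFSET] + ADDITIONAL_LENGTH_OFFSET + 1


full = build_response(36)
partial = truncate(full, 5)
print(len(partial), full_length(partial))
```

Because the count is not altered, an initiator can issue a short first request, read ADDITIONAL LENGTH, and then repeat the command with a large enough allocation length.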

MAPI: Format of PR_SEARCH_KEY

Does anyone know the format of the MAPI property PR_SEARCH_KEY?
The online documentation has this to say about it:
The search key is formed by concatenating the address type (in uppercase characters), the colon character ':', the e-mail address in canonical form, and the terminating null character.
And the exchange document MS-OXOABK says this:
The PidTagSearchKey property of type PtypBinary is a binary value formed by concatenating the ASCII string "EX: " followed by the DN for the object converted to all upper case, followed by a zero byte value.
However all the MAPI messages I've seen with this property have it as some sort of binary 16 byte sequence that looks like a GUID. Does anyone else have any more information about it? Is it always 16 bytes?
Thanks!
I believe that the property PR_SEARCH_KEY will be of different formats for different objects (as alluded to by Moishe).
A MAPI message object will have a unique value assigned on creation for PR_SEARCH_KEY, however if the object is copied this property value is copied also. I presume when you reply to an e-mail, Exchange will assign the PR_SEARCH_KEY value to be the original message's value.
You will need to inspect each object type to understand how the PR_SEARCH_KEY is formed but I doubt if it's always 16 bytes for all MAPI types.
This USENET discussion is a good one on the subject; it involves Dmitry Streblechenko, who is an expert on Extended MAPI.
The sentence before the ones you quoted from the online docs reads, "MAPI uses specific rules for constructing search keys for message recipients" which makes me think that it's talking about the PR_SEARCH_KEY property on MAPI_MAILUSER objects -- or at least not on MAPI_MESSAGE objects.
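For recipients, the rule quoted from the online documentation can be sketched as below. The function name is hypothetical, the address is assumed to be passed in already in canonical form, and ASCII encoding is an assumption; this does not describe the 16-byte values seen on message objects.

```python
def recipient_search_key(address_type: str, canonical_address: str) -> bytes:
    """Upper-case address type, ':', canonical address, terminating null."""
    text = f"{address_type.upper()}:{canonical_address}"
    return text.encode("ascii") + b"\x00"  # assumed ASCII encoding


key = recipient_search_key("smtp", "USER@EXAMPLE.COM")
print(key)
```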
