Am I able to use
message Foo {
  map<string, string> foo = 1;
}
in place of
message Foo {
  repeated KeyValuePair foo = 1;
}
message KeyValuePair {
  string key = 1;
  string value = 2;
}
?
The first source is in proto3 and the second is in proto2.
As long as you don't have duplicate keys, they will be very similar: on the wire, a map field is encoded like a repeated message whose key is field 1 and whose value is field 2. If you have duplicate keys, using a map will behave differently. According to the protobuf documentation, a parser reading the binary format keeps the last value seen for a duplicated key, whereas a repeated field keeps every entry. Also, "repeated" is usually implemented as a list/array/etc, so order is retained. "map" is usually implemented with some kind of map/dictionary structure, where order is not usually guaranteed.
So: if order doesn't matter and you always have unique keys: you're fine.
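To make the difference concrete, here is a minimal sketch in plain Java that does not use the protobuf library at all. It assumes that a repeated KeyValuePair field surfaces as a List of entries and a map field as a Map, which is how generated code typically looks. The class and method names are made up for illustration. The list keeps every entry in order, while loading the same entries into a map keeps only the last value for a duplicated key.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; this is not generated protobuf code.
final class MapVsRepeated {

    // Collapses a list of key/value pairs into a map. A later pair
    // overwrites an earlier pair that has the same key.
    // LinkedHashMap is used only to make printing stable; a protobuf map
    // field gives no ordering guarantee.
    static Map<String, String> toMap(List<Map.Entry<String, String>> pairs) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> pair : pairs) {
            result.put(pair.getKey(), pair.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> repeated = List.of(
                Map.entry("a", "1"),
                Map.entry("b", "2"),
                Map.entry("a", "3")); // duplicate key

        System.out.println(repeated.size());        // 3: every entry is kept
        System.out.println(toMap(repeated).size()); // 2: "a" now maps to "3"
    }
}
```

If callers rely on seeing both "a" entries, or on their order, the map form is not a drop-in replacement.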
Related
Is it (practically) possible to change the type name of a protobuf message type (or enum) without breaking communications?
Obviously the code that uses these types would need to be adapted in order to re-compile. The question is whether old clients that use the same structure, but the old names, would continue to work.
Example, based on the real file:
test.proto:
syntax = "proto3";
package test;
// ...
message TestMsgA {
  message TestMsgB { // should be called TestMsgZZZ going forward
    // ...
    enum TestMsgBEnum { // should be called TestMsgZZZEnum going forward
      // ...
    }
    TestMsgBEnum foo = 1;
    // ...
  }
  repeated TestMsgB bar = 1;
  // ...
}
Does the on-the-wire format of the protobuf payload change in any way if type or enum names are changed?
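If you're talking about the binary format, then no: names don't matter and will not affect your ability to load data. For enums, only the integer value is stored in the payload; for fields, only the field number is stored.
Obviously if you swap two names, confusion could happen, but: it should load as long as the structure matches.
If you're talking about the JSON format, then it may matter: JSON output uses field names and enum value names rather than numbers, so renaming those changes the payload. Message and enum type names do not normally appear in it, apart from cases such as the @type URL of a google.protobuf.Any.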
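As a rough illustration of why names are irrelevant to the binary format, the sketch below hand-encodes a single varint field the way the protobuf wire format describes it: a tag built from the field number and wire type, followed by the value. The class and method names are hypothetical; real code would use the classes generated by protoc rather than an encoder like this.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical hand-rolled encoder, for illustration only.
final class WireSketch {

    static final int WIRETYPE_VARINT = 0;

    // Writes a value as a base-128 varint, low-order group first.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes one enum-typed (or other varint) field: the tag, then the number.
    // No message, field, or enum name is passed in, because none is written.
    static byte[] encodeVarintField(int fieldNumber, int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, ((long) fieldNumber << 3) | WIRETYPE_VARINT);
        writeVarint(out, value);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Field number 1 holding enum number 2 encodes as the bytes 08 02.
        for (byte b : encodeVarintField(1, 2)) {
            System.out.printf("%02x ", b);
        }
        System.out.println();
    }
}
```

Renaming TestMsgB or TestMsgBEnum changes none of the inputs to this encoding, which is why old and new clients keep understanding each other's binary payloads.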
I am using protocol buffers defined like this:
message Index {
  message albums {
    repeated string name = 1;
  }
  map<string, albums> artists_albums = 1;
  map<int32, albums> year_albums = 2;
}
It generates Go code like this:
type Index struct {
    ArtistsAlbums map[string]*IndexAlbums
    YearAlbums    map[int32]*IndexAlbums
}
How can I make it generate map values of type IndexAlbums instead of *IndexAlbums?
If you use gogoprotobuf, there is an extension that allows this:
map<string, albums> artists_albums = 1 [(gogoproto.nullable) = false];
With regular goprotobuf I don't believe there is a way.
nullable, if false, a field is generated without a pointer (see warning below).
Warning about nullable: According to the Protocol Buffer specification, you should be able to tell whether a field is set or unset. With the option nullable=false this feature is lost, since your non-nullable fields will always be set. It can be seen as a layer on top of Protocol Buffers, where before and after marshalling all non-nullable fields are set and they cannot be unset.
I am trying to create a method that takes a list of items with set weights and chooses one at random. My solution was to use a HashMap whose Integer values act as weights for randomly selecting one of its keys. The keys of the HashMap can be a mix of object types, and I want to return the selected key.
However, I would like to avoid returning a null value, as well as avoiding mutation. Yes, I know this is Java, but there are more elegant ways to write Java, and I am hoping to solve this problem as it stands.
public <T> T getRandomValue(HashMap<?, Integer> VALUES) {
    final int SIZE = VALUES.values().stream().reduce(0, (a, b) -> a + b);
    final int RAND_SELECTION = ThreadLocalRandom.current().nextInt(SIZE) + 1;
    int currentWeightSum = 0;
    for (Map.Entry<?, Integer> entry : VALUES.entrySet()) {
        if (RAND_SELECTION > currentWeightSum && RAND_SELECTION <= (currentWeightSum + entry.getValue())) {
            return (T) entry.getKey();
        } else {
            currentWeightSum += entry.getValue();
        }
    }
    return null;
}
Since the code after the loop should never be reached under normal circumstances, you should indeed not write something like return null at this point, but rather throw an exception, so that irregular conditions can be spotted right at this point, instead of forcing the caller to eventually debug a NullPointerException, perhaps occurring at an entirely different place.
import java.util.ConcurrentModificationException;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.concurrent.ThreadLocalRandom;

public static <T> T getRandomValue(Map<T, Integer> values) {
    if (values.isEmpty())
        throw new NoSuchElementException();
    final int totalSize = values.values().stream().mapToInt(Integer::intValue).sum();
    if (totalSize <= 0)
        throw new IllegalArgumentException("sum of weights is " + totalSize);
    final int threshold = ThreadLocalRandom.current().nextInt(totalSize) + 1;
    int currentWeightSum = 0;
    for (Map.Entry<T, Integer> entry : values.entrySet()) {
        currentWeightSum += entry.getValue();
        if (threshold <= currentWeightSum) {
            return entry.getKey();
        }
    }
    // if we reach this point, the map's content must have been changed in-between
    throw new ConcurrentModificationException();
}
Note that the code also fixes some other issues in your code. You should not promise to return an arbitrary T without knowing the actual type of the map. If the map contains keys of different types, i.e. is a Map<Object,Integer>, the caller can’t expect to get anything more specific than Object. Besides that, you should not insist on the parameter being a HashMap when any Map is sufficient. Further, I changed the variable names to adhere to Java’s naming convention and simplified the loop’s body.
If you want to support empty maps as legal input, changing the return type to Optional<T> would be the best solution, returning an empty optional for empty maps and an optional containing the value otherwise (this would disallow null keys). Still, the supposed-to-be-unreachable code point after the loop should be flagged with an exception.
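The Optional-based variant described above could look roughly like the following sketch. It keeps the same selection logic and only changes how an empty map is reported; the class name WeightedPick is made up for illustration.

```java
import java.util.ConcurrentModificationException;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ThreadLocalRandom;

final class WeightedPick {

    // Returns a key chosen with probability proportional to its weight,
    // or an empty Optional when the map is empty.
    static <T> Optional<T> getRandomValue(Map<T, Integer> values) {
        if (values.isEmpty()) {
            return Optional.empty();
        }
        final int totalSize = values.values().stream().mapToInt(Integer::intValue).sum();
        if (totalSize <= 0) {
            throw new IllegalArgumentException("sum of weights is " + totalSize);
        }
        final int threshold = ThreadLocalRandom.current().nextInt(totalSize) + 1;
        int currentWeightSum = 0;
        for (Map.Entry<T, Integer> entry : values.entrySet()) {
            currentWeightSum += entry.getValue();
            if (threshold <= currentWeightSum) {
                // Optional.of rejects null, so null keys are not supported
                return Optional.of(entry.getKey());
            }
        }
        // only reachable if the map changed while we were iterating
        throw new ConcurrentModificationException();
    }

    public static void main(String[] args) {
        Optional<String> none = getRandomValue(Map.<String, Integer>of());
        Optional<String> only = getRandomValue(Map.of("only", 5));
        System.out.println(none); // Optional.empty
        System.out.println(only); // Optional[only]
    }
}
```

A caller can then decide what an empty map means, for example with getRandomValue(weights).orElse(fallback), instead of checking for null.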
I've run into a use case where I'd like to move an enum declared inside a protocol buffer message to outside the message so that other messages can use the same enum.
I.e., I'm wondering if there are any issues moving from this
message Message {
  enum Enum {
    VALUE1 = 1;
    VALUE2 = 2;
  }
  optional Enum enum_value = 1;
}
to this
enum Enum {
  VALUE1 = 1;
  VALUE2 = 2;
}
message Message {
  optional Enum enum_value = 1;
}
Would this cause any issues de-serializing data created with the first protocol buffer definition into the second?
It doesn't change the serialized data at all: the location and name of the enum are irrelevant for the actual data, since it just stores the integer value.
What might change is how some languages consume the enum, i.e. how they qualify it: is it X.Y.Foo, X.Foo, or just Foo? Note that since enums follow C++ naming/scoping rules, some things (such as conflicts) aren't an issue, but the change may affect consumers in some languages.
So: if you're the only consumer of the .proto, you're absolutely fine here. If you have shared the .proto with other people, it may be problematic to change it unless they are happy to update their code to match any new qualification requirements.
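The qualification change can be sketched with hand-written Java stand-ins for the two definitions. These are not real generated classes, and the names BeforeMove and AfterMove are invented for the example. The exact shape of generated code depends on the language and on the protoc options in use.

```java
// Hand-written stand-ins for generated code; all names are illustrative only.
final class QualificationDemo {
    public static void main(String[] args) {
        // Consumer code written against the old definition:
        BeforeMove.Message.Enum oldStyle = BeforeMove.Message.Enum.VALUE1;
        // The same reference after the enum moved out of the message:
        AfterMove.Enum newStyle = AfterMove.Enum.VALUE1;
        // Same value name either way; only the qualified type name differs.
        System.out.println(oldStyle.name().equals(newStyle.name())); // true
    }
}

// Enum nested inside the message, as in the first .proto definition.
final class BeforeMove {
    static final class Message {
        enum Enum { VALUE1, VALUE2 }
    }
}

// Enum declared next to the message, as in the second .proto definition.
final class AfterMove {
    enum Enum { VALUE1, VALUE2 }
    static final class Message { }
}
```

Any consumer that spelled out the nested type name has to be updated and recompiled, even though the bytes it reads and writes stay the same.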
I'm wondering if it is possible to use Google Protocol Buffers' enum constants as a field number of other messages, like
enum Code {
  FOO = 100;
  BAR = 101;
}
message Message {
  required string foo = FOO;
}
This code doesn't work because FOO's type is enum Code and only a number can be used as a field number.
I am trying to build polymorphic message definitions like this animal example, that defines Cat = 1; in enum Type and required Cat animal = 100; as a unique extension number.
I thought it'd be nice to do
message Message {
  required string foo = FOO.value;
}
, so that I can ensure the uniqueness of the extension field number without introducing another magic number.
So the question: is it possible to refer to an enum's integer value in the protocol buffer language?
No, there is no way to do this. Sorry.
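BTW, two enumerants of the same enum type can actually have the same numeric value (aliases, which current versions of protoc accept only when the enum sets option allow_alias = true), so defining these values in an enum does not actually ensure uniqueness.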