I have a serialized binary file of protobuf messages, written mainly with protobuf-net.
I want to decode it and see its structure.
I used some tools, such as:
https://protogen.marcgravell.com/decode
and I also used protoc:
protoc --decode_raw < ~/Downloads/file.bin
and this is part of the result I get:
1 {
1: "4f81b7bb-d8bd-e911-9c1f-06ec640006bb"
2: 0x404105b1663ef93a
3: 0x4049c6158c593f36
4: 0x40400000
5 {
1: "53f8afde-04c6-e811-910e-4622e9d1766e"
2 {
1: "e993fba0-8fc9-e811-9c15-06ec640006bb"
}
2 {
1: "9a7c7210-3aca-e811-9c15-06ec640006bb"
2: 1
}
2 {
1: "2d7d12f1-2bc9-e811-9c15-06ec640006bb"
}
3: 18446744073709551615
}
6: 46
7: 1571059279000
}
How can I decode it? I want to understand the structure, change some data in it, and write a new bin file.
Reverse engineering a .proto file is mostly a case of looking at the output of the tools such as you've mentioned, and trying to write a .proto that looks similar. Unfortunately, a number of concepts are ambiguous if you don't know the schema, as multiple different data types and shapes share the same encoding details, but... we can make guesses.
Looking at your output:
1 {
...
}
tells us that our root message probably has a sub-message at field 1; so:
message Root {
repeated Foo Foos = 1;
}
(I'm guessing at the repeated here; if the 1 only appears once, it could be single)
with everything at the next level being our Foo.
1: "4f81b7bb-d8bd-e911-9c1f-06ec640006bb"
2: 0x404105b1663ef93a
3: 0x4049c6158c593f36
4: 0x40400000
5: { ... }
6: 46
7: 1571059279000
this looks like it could be
message Foo {
string A = 1;
sfixed64 B = 2;
sfixed64 C = 3;
sfixed32 D = 4;
repeated Bar E = 5; // again, might not be "repeated" - see how many times it occurs
int64 F = 6;
int64 G = 7;
}
however; those sfixed64 could be double, or fixed64; and those sfixed32 could be fixed32 or float; likewise, the int64 could be sint64 or uint64 - or int32, sint32, uint32 or bool, and I wouldn't be able to tell (they are all just "varint"). Each option gives a different meaning to the value!
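To make that ambiguity concrete, here is a rough Rust sketch that reinterprets some of the raw values from the dump under a few of the candidate types. The helper names (as_double, as_float, as_int64, as_sint64) are made up for illustration; which reading is "right" remains a guess until you know the schema.

```rust
/// Reads a fixed64 wire value as an IEEE 754 double (the `double` interpretation).
fn as_double(bits: u64) -> f64 {
    f64::from_bits(bits)
}

/// Reads a fixed32 wire value as an IEEE 754 single (the `float` interpretation).
fn as_float(bits: u32) -> f32 {
    f32::from_bits(bits)
}

/// Reads a varint as int64: plain two's complement.
fn as_int64(raw: u64) -> i64 {
    raw as i64
}

/// Reads a varint as sint64: ZigZag encoding, where the lowest bit carries the sign.
fn as_sint64(raw: u64) -> i64 {
    ((raw >> 1) as i64) ^ -((raw & 1) as i64)
}

fn main() {
    // Fields 2 and 3 of Foo, printed by protoc as fixed64 hex values.
    println!("field 2 as double: {}", as_double(0x4041_05b1_663e_f93a));
    println!("field 3 as double: {}", as_double(0x4049_c615_8c59_3f36));
    // Field 4 of Foo, printed as a fixed32 hex value.
    println!("field 4 as float:  {}", as_float(0x4040_0000));
    // Field 3 of Bar: the same varint under two different readings.
    println!("Bar.3 as uint64:   {}", 18446744073709551615u64);
    println!("Bar.3 as int64:    {}", as_int64(18446744073709551615));
    // Field 6 of Foo: int64 and sint64 disagree about the same bytes.
    println!("field 6 as int64:  {}", as_int64(46));
    println!("field 6 as sint64: {}", as_sint64(46));
}
```

If the double readings of fields 2 and 3 come out as plausible real-world numbers while the integer readings look like noise, that is a reasonable hint (not proof) that double is the better guess.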
our Bar definitely has some kind of repeated, because of all the 2:
1: "53f8afde-04c6-e811-910e-4622e9d1766e"
2 { ... }
2 { ... }
2 { ... }
3: 18446744073709551615
let's guess at:
message Bar {
string A = 1;
repeated Blap B = 2;
int64 C = 3;
}
and finally, looking at the 2 from the previous bit, we have:
1: "e993fba0-8fc9-e811-9c15-06ec640006bb"
and
1: "9a7c7210-3aca-e811-9c15-06ec640006bb"
2: 1
and
1: "2d7d12f1-2bc9-e811-9c15-06ec640006bb"
so combining those, we might guess:
message Blap {
string A = 1;
int64 B = 2;
}
Depending on whether you have more data, there may be additional fields, or you may be able to infer more context. For example, if an int64 value such as Blap.B is always 1 or omitted, it might actually be a bool. If one of the repeated elements always has at most one value, it might not be repeated.
The trick is to play with it until you can deserialize the data, re-serialize it, and get the exact same payload (i.e. round-trip).
Once you have that: you'll want to deserialize it, mutate the thing you wanted to change, and serialize.
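As an illustration of the round-trip idea at the wire level, here is a minimal schema-less sketch in Rust. It is not protobuf-net or protoc; the Value enum and the decode/encode helpers are invented for this example, it only handles the four common wire types (no groups), and the sample bytes are hand-built rather than taken from your file.

```rust
// Schema-less view of one field value, keyed only by its wire type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Varint(u64),
    Fixed64(u64),
    Bytes(Vec<u8>), // strings, bytes, sub-messages and packed fields all look like this
    Fixed32(u32),
}

fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut result = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None; // more than 10 bytes: not a valid varint
        }
        result |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

fn write_varint(mut v: u64, out: &mut Vec<u8>) {
    while v >= 0x80 {
        out.push((v as u8 & 0x7f) | 0x80);
        v >>= 7;
    }
    out.push(v as u8);
}

// Returns the next `len` bytes and advances `pos`, or None if the buffer is too short.
fn read_exact<'a>(buf: &'a [u8], pos: &mut usize, len: usize) -> Option<&'a [u8]> {
    let end = (*pos).checked_add(len)?;
    let bytes = buf.get(*pos..end)?;
    *pos = end;
    Some(bytes)
}

fn decode(buf: &[u8]) -> Option<Vec<(u64, Value)>> {
    let mut pos = 0;
    let mut fields = Vec::new();
    while pos < buf.len() {
        let key = read_varint(buf, &mut pos)?; // key = (field number << 3) | wire type
        let value = match key & 7 {
            0 => Value::Varint(read_varint(buf, &mut pos)?),
            1 => {
                let mut raw = [0u8; 8];
                raw.copy_from_slice(read_exact(buf, &mut pos, 8)?);
                Value::Fixed64(u64::from_le_bytes(raw))
            }
            2 => {
                let len = read_varint(buf, &mut pos)? as usize;
                Value::Bytes(read_exact(buf, &mut pos, len)?.to_vec())
            }
            5 => {
                let mut raw = [0u8; 4];
                raw.copy_from_slice(read_exact(buf, &mut pos, 4)?);
                Value::Fixed32(u32::from_le_bytes(raw))
            }
            _ => return None, // groups (wire types 3 and 4) are not handled in this sketch
        };
        fields.push((key >> 3, value));
    }
    Some(fields)
}

fn encode(fields: &[(u64, Value)]) -> Vec<u8> {
    let mut out = Vec::new();
    for (number, value) in fields {
        let mut body = Vec::new();
        let wire_type: u64 = match value {
            Value::Varint(v) => { write_varint(*v, &mut body); 0 }
            Value::Fixed64(v) => { body.extend_from_slice(&v.to_le_bytes()); 1 }
            Value::Bytes(b) => {
                write_varint(b.len() as u64, &mut body);
                body.extend_from_slice(b);
                2
            }
            Value::Fixed32(v) => { body.extend_from_slice(&v.to_le_bytes()); 5 }
        };
        write_varint((*number << 3) | wire_type, &mut out);
        out.extend_from_slice(&body);
    }
    out
}

fn main() {
    // Hand-built sample: field 1 = "ab", field 4 = fixed32 0x40400000, field 6 = varint 46.
    let payload: Vec<u8> = vec![0x0a, 0x02, b'a', b'b', 0x25, 0x00, 0x00, 0x40, 0x40, 0x30, 0x2e];
    let fields = decode(&payload).expect("malformed payload");
    assert_eq!(encode(&fields), payload); // the round-trip check
    println!("{:?}", fields);
}
```

With a real schema you would run the same check through your generated types instead: deserialize, serialize, and compare the bytes. A mismatch usually means one of the type guesses is wrong.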
I have a message of the following type
message Foo {
string bar = 1;
float baz = 2;
}
Is there any problem with converting it to the following for use in Go?
message Foo {
string bar = 1;
optional float baz = 2;
}
Is deprecating the field and creating a new one the preferred approach in this case as well?
It depends on how deeply integrated that particular message is in your code base, meaning:
are you storing the marshaled binary representation somewhere, such as your database?
are different parts of your code base using different versions of the message you are modifying, e.g. older versions of your Android/iOS apps?
The point being: if you unmarshal encoded data into a message structure that is not the very same structure it was generated with, bad things will happen.
The docs recommend adding a new element to circumvent such scenarios entirely. If that is not something you want to do, take the above points into consideration.
The optional will make the field a pointer type. So in Go generated code, optional float will become *float32, which of course is not float32.
To deprecate a field, use [deprecated = true] field option:
message Foo {
string bar = 1;
float baz = 2 [deprecated = true];
}
If in subsequent releases of your protobuf schema you actually remove the field altogether from the message, you might want to add reserved 2, where 2 is the number of the field you removed.
message Foo {
string bar = 1;
reserved 2;
}
This helps prevent other people, or future you, from adding a new field at number 2. This matters if you have outdated clients that still expect a float in field 2.
PS: optional fields in proto3 are supported from version 3.15.
I suggest using the FloatValue type defined in the google.protobuf package. For example:
syntax = "proto3";
import "google/protobuf/wrappers.proto";
message Foo {
string bar = 1;
google.protobuf.FloatValue baz = 2;
}
This will generate a .pb.go file containing:
type Foo struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
Bar string `protobuf:"bytes,1,opt,name=bar,proto3" json:"bar,omitempty"`
Baz *wrapperspb.FloatValue `protobuf:"bytes,2,opt,name=baz,proto3" json:"baz,omitempty"`
}
You can use it as follows:
f := Foo{
Bar: "Bar",
Baz: &wrapperspb.FloatValue{Value: float32(3)},
}
var floatValue float32
if f.Baz != nil {
floatValue = f.Baz.GetValue()
}
I have a protobuf schema with a bunch of repeated structures. Something like
syntax = "proto3";
package My;
message TopLevel
{
string swVersion = 3;
string reportMac = 4;
string reportSsid = 6;
}
message Temperature
{
required uint64 ts = 1;
required uint32 source = 3;
repeated sint32 readings = 4;
}
message MyMessage
{
required TopLevel topLevel = 1;
repeated Temperature temperature = 2;
}
I compile with pbtools and get the structures and functions for Temperature and readings. However I am having a hard time figuring out how to add "Temperature" entries dynamically.
Or am I out of luck, and pbtools requires telling it ahead of time how many entries I have? One problem is that data is encoded as it is generated, and I do not know how many of each I will have for each report.
I attached the generated code.
pbtools requires the length before adding any items.
In the code below, I want to retain number_list, after iterating over it, since the .into_iter() that for uses by default will consume. Thus, I am assuming that n: &i32 and I can get the value of n by dereferencing.
fn main() {
let number_list = vec![24, 34, 100, 65];
let mut largest = number_list[0];
for n in &number_list {
if *n > largest {
largest = *n;
}
}
println!("{}", largest);
}
It was revealed to me that instead of this, we can use &n as a 'pattern':
fn main() {
let number_list = vec![24, 34, 100, 65];
let mut largest = number_list[0];
for &n in &number_list {
if n > largest {
largest = n;
}
}
println!("{}", largest);
number_list;
}
My confusion (and bear in mind I haven't covered patterns) is that I would expect that since n: &i32, then &n: &&i32 rather than it resolving to the value (if a double ref is even possible). Why does this happen, and does the meaning of & differ depending on context?
It can help to think of a reference as a kind of container. For comparison, consider Option, where we can "unwrap" the value using pattern-matching, for example in an if let statement:
let n = 100;
let opt = Some(n);
if let Some(p) = opt {
// do something with p
}
We call Some and None constructors for Option, because they each produce a value of type Option. In the same way, you can think of & as a constructor for a reference. And the syntax is symmetric:
let n = 100;
let reference = &n;
if let &p = reference {
// do something with p
}
You can use this feature in any place where you are binding a value to a variable, which happens all over the place. For example:
if let, as above
match expressions:
match opt {
Some(1) => { ... },
Some(p) => { ... },
None => { ... },
}
match reference {
&1 => { ... },
&p => { ... },
}
In function arguments:
fn foo(&p: &i32) { ... }
Loops:
for &p in iter_of_i32_refs {
...
}
And probably more.
Note that the last two won't work for Option: a pattern like Some(p) is refutable (it would fail to match if a None were found), so the compiler rejects it in those positions. That can't happen with references, because they only have one constructor, &.
does the meaning of & differ depending on context?
Hopefully, if you can interpret & as a constructor instead of an operator, then you'll see that its meaning doesn't change. It's a pretty cool feature of Rust that you can use constructors on the right hand side of an expression for creating values and on the left hand side for taking them apart (destructuring).
Unlike in other languages (C++), &n in this case doesn't take a reference. It is a pattern, which means it expects a reference and binds n to the value behind it.
The opposite of this would be ref n which would give you &&i32 as a type.
This is also the case for closures, e.g.
(0..).filter(|&idx| idx < 10)...
Please note that this moves the value out of the reference, so you cannot do this with types that don't implement the Copy trait.
My confusion (and bear in mind I haven't covered patterns) is that I would expect that since n: &i32, then &n: &&i32 rather than it resolving to the value (if a double ref is even possible). Why does this happen, and does the meaning of & differ depending on context?
When you do pattern matching (for example when you write for &n in &number_list), you're not saying that n is an &i32, instead you are saying that &n (the pattern) is an &i32 (the expression) from which the compiler infers that n is an i32.
Similar things happen for all kinds of patterns. For example, when pattern-matching in if let Some(x) = Some(42) { /* … */ } we are saying that Some(x) is Some(42), therefore x is 42.
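A small sketch may help tie these answers together. The function names (largest, longest) are made up for this example; the first uses the &n pattern on a Copy type, and the second shows the usual fallback for a non-Copy type such as String, where you bind the reference itself.

```rust
/// `&n` is a pattern: it matches each `&i32` item and binds `n: i32` by copying.
fn largest(numbers: &[i32]) -> i32 {
    let mut largest = numbers[0];
    for &n in numbers {
        if n > largest {
            largest = n;
        }
    }
    largest
}

/// `String` is not `Copy`, so `for &w in words` would not compile
/// (it would try to move each String out of the borrowed slice).
/// Binding the reference itself works: here `w: &String`.
fn longest(words: &[String]) -> &String {
    let mut longest = &words[0];
    for w in words {
        if w.len() > longest.len() {
            longest = w;
        }
    }
    longest
}

fn main() {
    let number_list = vec![24, 34, 100, 65];
    println!("{}", largest(&number_list));
    // number_list was only borrowed, so it is still usable here.
    println!("{}", number_list.len());

    let words = vec![String::from("a"), String::from("abc"), String::from("ab")];
    println!("{}", longest(&words));
}
```

Both functions panic on an empty slice because of the [0] index, just like the code in the question.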
What I am looking for is a function that returns the message of a repeated field.
I know there is Reflection::AddMessage, which has the return type that I want, but I do not want to add a message, just return an object of that message.
Here is an example of what I am trying to do. Let's say I have this in the .proto file:
message Bar{
uint32 x = 1;
uint64 y = 2;
}
message Foo{
repeated Bar myMessage = 1;
}
I am using reflection to iterate through the Foo message and I want to be able to do something like this:
Message* Msg = createMessage(refl->FooMsg, FieldDesc)
I know there is also GetRepeatedMessage but that requires index.
First of all, when the protobuf compiler generates the C++ code, you get accessor functions in the interface. There is mutable_nameOf_message(), which returns a pointer to the entire repeated field (in C++ this is a RepeatedPtrField, a vector-like container, not a std::vector), and mutable_nameOf_message(index), which gives you the specified element.
Now, if you do not want to use Bar, then you don't need to.
message ArrayOfBar
{
// field numbers must be positive, so 0 is not allowed here
repeated Bar arrayOfBar = 1;
message Bar{
uint32 x = 1;
uint64 y = 2;
}
}
If that's what you had in mind, you could also do something like this:
std::vector<Bar> arrayOfBars;
But that idea needs refinement because of the internal specifics of the Protobuf. Some unwanted behavior might occur with something like that.
Let's imagine that we have a privacy options page in a social network, with two groups of radio buttons.
Allow to post on wall p f c (groupA)
Allow to view wall p f c (groupB)
p = public
f = only friends
c = closed
It is obvious that there is a dependency between these groups of radio buttons. For example, we should automatically set groupA=c when groupB=c: if viewing the wall is closed, the wall comments form should also be closed, and so on.
It is possible to solve this problem using numerous ifs, but we will end up with a very complex control structure as a result.
Any good solution?
Thank you
You have 2 sets of permissions, and 'write' permissions should never be less restrictive than read.
Number the levels 0 = no access, 1 = limited (friends only), 2 = public access. Then, after the value in GroupB changes, validating the GroupA value may look like GroupA.value = (GroupA.value <= GroupB.value) ? GroupA.value : GroupB.value. Here GroupB holds the read permissions and GroupA the write permissions.
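The same rule can be sketched in Rust with an ordered enum, so the clamp becomes a single min call. The Access enum and the clamp_post function are hypothetical names chosen for this illustration.

```rust
// Declaration order defines the ordering: Closed < Friends < Public.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Access {
    Closed = 0,
    Friends = 1,
    Public = 2,
}

/// Posting must never be less restrictive than viewing,
/// so the effective "post" level is the smaller of the two.
fn clamp_post(view: Access, post: Access) -> Access {
    post.min(view)
}

fn main() {
    let view = Access::Friends;
    let post = Access::Public; // an invalid combination chosen in the UI
    println!("{:?}", clamp_post(view, post)); // Friends
}
```

Calling clamp_post after every change to the "view" group keeps the pair consistent without a chain of ifs.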
Define one bit-mask for viewing, and another bitmask for posting, with one bit in each for public and friends (closed simply means both bits are set to 0). A bit that's set to 1 allows access, and a bit that's set to 0 denies access.
AND the "post" bitmask with the "view" bitmask to ensure that all the bits that are cleared in the "view" bitmask are also cleared in the "post" bitmask.
In something like C or C++, this would look something like this:
enum { ALLOW_FRIENDS = 1, ALLOW_PUBLIC = 2 }; // "public" is a C++ keyword, so it cannot be used as a name
unsigned view;
unsigned post;
view = ALLOW_FRIENDS;
post = ALLOW_FRIENDS | ALLOW_PUBLIC; // create an invalid combination
post &= view; // correct the invalid combination
You can also define the comparisons in a structure and check every entry in a function.
I mean something like this in C:
#include <stdio.h>
#define ACCESS_CLOSED 0
#define ACCESS_FRIEND 1
#define ACCESS_PUBLIC 2
typedef struct dep {
int *master;
int masterval;
int *slave;
int slaveval;
} dep_t;
/* Applies every rule: when *master equals masterval, *slave is forced to slaveval. */
void checkdeps(dep_t *deps, int n)
{
int i;
for (i=0; i<n; i++) {
if (*(deps[i].master) == deps[i].masterval)
*(deps[i].slave) = deps[i].slaveval;
}
}
int main(void)
{
int groupA = ACCESS_FRIEND;
int groupB = ACCESS_FRIEND;
int groupC = ACCESS_FRIEND;
// if the first argument has the value of the second argument
// then the third is set to the value from the fourth
dep_t deps[] = {
{ &groupB, ACCESS_CLOSED, &groupA, ACCESS_CLOSED },
{ &groupB, ACCESS_FRIEND, &groupC, ACCESS_CLOSED }
};
groupB = ACCESS_CLOSED;
checkdeps(deps, sizeof(deps)/sizeof(dep_t));
printf("A: %d, B: %d, C: %d\n", groupA, groupB, groupC);
return 0;
}