Protobuf syntax with colons - syntax

Here are three examples of protobuf syntax.
#1: mscoco_label_map.pbtxt
item {
  name: "/m/01g317"
  id: 1
  display_name: "person"
}
item {
  name: "/m/0199g"
  id: 2
  display_name: "bicycle"
}
...
#2: en.wikipedia.org
anotherfield {
  foo: 123
  bar: 456
}
anotherfield {
  foo: 222
  bar: 333
}
#3: Official documentation
syntax = "proto3";
message SearchRequest {
  string query = 1;
  int32 page_number = 2;
  int32 result_per_page = 3;
}
The official example (#3) obviously differs from #1 and #2. Did I miss a paragraph in the official documentation stating that a colon can be used instead of an equals sign?
The official documentation describes a JSON mapping, but it contains not a single example that looks like #1 or #2. Also, #1 and #2 are not valid JSON either (missing quotes around keys, missing commas).
Q: Where does the syntax of #1 and #2 come from?
A link to a better syntax description (better than the official docs) would be appreciated.

Thanks to Marc Gravell's response I was able to find the answer.
#3 is a schema, written in proto syntax; think XSD (XML Schema Definition).
#1 and #2 are text dumps of actual data (a payload), written in textproto syntax with the *.pbtxt file extension; think XML or JSON.
Related question: What does the protobuf text format look like?
Related links:
https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.text_format
C++ API for printing and parsing protocol messages in a human-readable, text-based format.
https://googleapis.dev/python/protobuf/latest/google/protobuf/text_format.html
Python API
https://developers.google.com/protocol-buffers/docs/reference/proto3-spec
Protocol Buffers Version 3 Language Specification
https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/text_format.cc#L288
Source code (C++) of text format ASCII representation parser.
https://medium.com/#nathantnorth/protocol-buffers-text-format-14e0584f70a5
Protocol Buffers: Text Format (2019-10-11, Nathan North)
There are a couple different ways to structure the data (instead of using brackets, you could use <>) but unfortunately nothing in this realm is terribly well documented so it’s more of a game of try it and see if it works.
https://gist.github.com/henridf/704c1c812f04a502c1c26f77a739090b
Encoding a protobuf with protoc --encode (syntax example).
Clear syntax documentation of protobuf text format is still missing.
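To make the schema-versus-payload distinction concrete, here is a minimal Python sketch that reads the brace-and-colon text format of #1 and #2 into nested dictionaries. It is not the official parser (that is google.protobuf.text_format, which needs the generated message class). The names parse_textproto, parse_message and parse_scalar are hypothetical, and the sketch ignores comments, <> delimiters and [] lists. Because no schema is available, every field is collected into a list, since any field might be repeated.

```python
import re

# A token is a brace, a colon, a double-quoted string, or a bare word/number.
TOKEN_RE = re.compile(r'[{}:]|"(?:[^"\\]|\\.)*"|[^\s{}:"]+')


def parse_scalar(token):
    """Turn a scalar token into str, int, or float (bare words stay strings)."""
    if token.startswith('"'):
        return token[1:-1]
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    return token


def parse_message(tokens, pos=0):
    """Parse fields until '}' or end of input; return (fields, next_pos)."""
    fields = {}
    while pos < len(tokens) and tokens[pos] != "}":
        name = tokens[pos]
        pos += 1
        if pos < len(tokens) and tokens[pos] == ":":
            pos += 1  # the colon may be omitted before a nested message
        if pos < len(tokens) and tokens[pos] == "{":
            value, pos = parse_message(tokens, pos + 1)
            pos += 1  # skip the closing '}'
        else:
            value = parse_scalar(tokens[pos])
            pos += 1
        # Without a schema we cannot tell repeated fields from singular ones.
        fields.setdefault(name, []).append(value)
    return fields, pos


def parse_textproto(text):
    """Parse a small subset of the protobuf text format into nested dicts."""
    fields, _ = parse_message(TOKEN_RE.findall(text))
    return fields


SAMPLE = '''
item {
  name: "/m/01g317"
  id: 1
  display_name: "person"
}
item {
  name: "/m/0199g"
  id: 2
  display_name: "bicycle"
}
'''

parsed = parse_textproto(SAMPLE)
print(parsed["item"][1]["display_name"])  # ['bicycle']
```

Note that the result has field names but no types or field numbers: those live only in the .proto schema (#3).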

Related

Decode Protobuf Text

I have some Protobuf text that I'm receiving via an http response from a website. The text roughly looks like this:
1 {
  2: some value
  7: {
    12: some value
  }
  8: some value
}
except the content is much larger. I don't want to paste the actual text for security purposes.
Anyways, how can I "decode" this so that I can see the schemas?
At the moment it is impossible to obtain a perfectly accurate schema from a protobuf message.
That being said, you can get semi-close. There are some tools like protobuf-inspector that can print out a bit more information about the structure of the message.
Some important caveats about this tool (and in general) as to why it's not possible to obtain the full schema, taken from the README of the tool:
[...] the field names are obviously lost, together with some high-level details such as:
whether a varint uses zig-zag encoding or not (will assume no zig-zag by default)
whether a 32-bit/64-bit value is an integer or float (both shown by default)
signedness (auto-detect by default)
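The caveats above follow directly from the wire format, which stores only field numbers and wire types. As a rough illustration (a sketch of the general idea, not how protobuf-inspector is implemented), the following Python decodes the top-level fields of a serialized message without any schema. The function names are hypothetical, and the sample bytes are a hand-built message with field 1 as a varint and field 2 as a length-delimited value.

```python
import struct


def read_varint(buf, pos):
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def zigzag_decode(n):
    """Interpret a varint as a zig-zag encoded signed integer (sint32/sint64)."""
    return (n >> 1) ^ -(n & 1)


def decode_fields(buf):
    """Return a list of (field_number, wire_type, raw_value) tuples."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint: int, sint, bool, or enum (cannot tell which)
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # 64-bit: fixed64, sfixed64, or double
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited: string, bytes, or nested message
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # 32-bit: fixed32, sfixed32, or float
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


message = bytes.fromhex("089601120774657374696e67")
print(decode_fields(message))  # [(1, 0, 150), (2, 2, b'testing')]

# The same varint means 150 as a plain int but 75 with zig-zag encoding.
print(zigzag_decode(150))  # 75

# The same four bytes are a valid int32 and a valid float.
raw = bytes.fromhex("0000803f")
print(struct.unpack("<i", raw)[0], struct.unpack("<f", raw)[0])
```

The decoder recovers numbers and raw bytes, but nothing in the input says what field 1 is called or whether 150 or 75 was intended, which is why only an approximate schema can be reconstructed.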

Using maps in protobuf v2

Currently I am using protobuf version 2 in my project. So far all of the messages are working great; however, I have hit a roadblock trying to use the 'map' keyword.
The TL;DR of why I need the map: I want to pass some JSON key/value pairs to my server to do a lookup, and potentially log the data to a server (which uses a JSON message interface).
I am currently using the backwards compatibility method that is recommended on the docs page: https://developers.google.com/protocol-buffers/docs/proto#maps
What I would like to understand is why the following declaration of my message (using maps) fails to compile. I am using the following version of the protoc compiler: '# protoc --version => libprotoc 2.6.1'
message MapFieldEntry {
  optional string key = 1;
  optional string value = 2;
}
message Lookup {
  repeated MapFieldEntry map_field = 1;
  map<string, string> test_map = 2;
}
The error I receive is as follows (the errors don't make sense to me considering the documentation of the map feature):
Expected "required", "optional", or "repeated".
Expected field name.
I have tried adding syntax="proto2"; at the top, but I still get the error.
Edit:
Just as a note: the issue I am having concerns the second field of the Lookup message. The first field is what I am currently using as a workaround.
I found someone else with a similar issue on GitHub:
https://github.com/google/protobuf/issues/799
The response is:
The maps syntax is only supported starting from v3.0.0. The "proto2"
in the doc is referring to the syntax version, not protobuf release
version. v3.0.0 supports both proto2 syntax and proto3 syntax while
v2.6.1 only supports proto2 syntax. For all users, it's recommended to
use v3.0.0-beta-1 instead of v2.6.1.
So it looks like, to fix your problem, you should use protoc 3 instead of 2.6.1.
Keep syntax = "proto2"; at the top of your file to specify that you are using the proto2 syntax.
Could you try this and let me know if it works? This is an interesting question, as the official docs do not mention it.
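Until the compiler is upgraded, the repeated MapFieldEntry workaround carries the same information as a map field, since a map is documented as being equivalent on the wire to a repeated entry message with key = 1 and value = 2. A minimal sketch of converting between the two shapes on the application side follows. The dataclass is a plain-Python stand-in for the generated MapFieldEntry class, and both helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MapFieldEntry:
    """Stand-in for the MapFieldEntry message from the question."""
    key: str
    value: str


def dict_to_entries(mapping):
    """Flatten a dict into the repeated key/value entries used as a workaround."""
    return [MapFieldEntry(key=k, value=v) for k, v in mapping.items()]


def entries_to_dict(entries):
    """Rebuild a dict from entries; for duplicate keys the last entry wins."""
    return {entry.key: entry.value for entry in entries}


lookup = {"user": "alice", "action": "login"}
entries = dict_to_entries(lookup)
print(entries_to_dict(entries) == lookup)  # True
```

Letting the last duplicate win mirrors how protobuf map fields treat repeated keys when parsing, so code written this way should keep behaving the same after switching to a real map field.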

Separate YAML and plain text on the same document

While building a blog using Django I realized that it would be extremely practical to store the text of an article and all the related information (title, author, etc.) together in a human-readable file format, and then load those files into the database using a simple script.
That said, YAML caught my attention for its readability and ease of use. The only downside of the YAML syntax is the indentation:
---
title: Title of the article
author: Somebody
# Other stuffs here ...
text: |
    This is the text of the article. I can write whatever I want
    but I need to be careful with the indentation...and this is a
    bit boring.
---
I believe that's not the best solution (especially if the files are going to be written by casual users). A format like this one would be much better:
---
title: Title of the article
author: Somebody
# Other stuffs here ...
---
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
Is there any solution? Preferably using Python.
Suggestions for other file formats are welcome as well!
Unfortunately this is not possible. What one would think could work is using | for a single scalar in a separate document:
import ruamel.yaml
yaml_str = """\
title: Title of the article
author: Somebody
---
|
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
"""
for d in ruamel.yaml.load_all(yaml_str):
    print(d)
    print('-----')
but it doesn't work, because | is the block scalar indicator, which expects indented content. Although an indentation of 0 (zero) could easily work at the top level, ruamel.yaml (and PyYAML) don't allow this.
It is however easy to parse this yourself. Doing so has an advantage over the front matter package: you can use YAML 1.2 and are not restricted to YAML 1.1, which is what that package gives you because it uses PyYAML. Also note that I used the more appropriate end-of-document marker ... to separate the YAML from the Markdown:
import ruamel.yaml
combined_str = """\
title: Title of the article
author: Somebody
...
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
"""
with open('test.yaml', 'w') as fp:
    fp.write(combined_str)
data = None
lines = []
yaml_str = ""
with open('test.yaml') as fp:
    for line in fp:
        if data is not None:
            # the YAML header has already been parsed: collect body lines
            lines.append(line)
            continue
        if line == '...\n':
            # end-of-document marker: parse the header collected so far
            data = ruamel.yaml.round_trip_load(yaml_str)
            continue
        yaml_str += line
print(data['author'])
print(lines[2])
which gives:
Somebody
I want...
(the round_trip_load allows dumping with preservation of comments, anchor names etc).
I found Front Matter does exactly what I want to do.
There is also a python package.
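For the exact layout proposed in the question (a header between two --- lines, followed by free text), the split itself needs nothing beyond the standard library. The following is a minimal sketch, not the implementation of the Front Matter package, and split_front_matter is a hypothetical helper name. It only separates the two parts; the header string would still have to be handed to a YAML loader of your choice.

```python
def split_front_matter(text):
    """Split '---' delimited front matter from the body.

    Returns (header, body). If the text does not start with a '---' line,
    or the closing '---' line is missing, the header is empty and the
    whole text is returned as the body.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            header = "".join(lines[1:index])
            body = "".join(lines[index + 1:])
            return header, body
    return "", text


article = """\
---
title: Title of the article
author: Somebody
---
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...
"""

header, body = split_front_matter(article)
print(header)
print(body)
```

Only the first closing --- line is treated as a delimiter, so a --- that appears later in the article body (for example as a Markdown horizontal rule) is left untouched.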

Primitive type as data structure for API Blueprint

I want to use a primitive type to describe a data structure, like so:
# Data Structures
## Video Delete (enum[number])
+ `0` - Successful deletion.
+ `1` - Error occurred.
And the output is:
{
  "enum": [
    1,
    0
  ],
  "$schema": "http://json-schema.org/draft-04/schema#"
}
So the descriptions are missing. I've tried putting the description in different places, and I tried a lot of other things (which I'd rather not go into). I've also tried adding type info to the enum values, like so:
+ `0` (number) - Successful deletion.
I do not know whether this problem is caused by the MSON syntax or by the Aglio generator.
The syntax above is supported by MSON as far as I can tell. The problem is that Aglio doesn't do anything with the description, and when I went to look into adding it I realized that it isn't really supported in JSON Schema. There seem to be two methods people use to get around that fact:
1. Add the enumerated value descriptions to the main description. The Olio theme 1.6.2 has support for this, but the C++ parser still seems to have some bugs around this feature:
## Video Delete (enum[number]) - 0 for success, 1 for error
2. Use a weird oneOf syntax where you create sets of single enums, each with a description. I don't recommend this.
Unfortunately the first option requires work on your part and can't easily be done in Aglio. Does anyone else have a better description and some samples of MSON input -> JSON Schema output?
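To show what the two workarounds would look like on the JSON Schema side, here is a small Python sketch that builds both shapes for the Video Delete enum. This is hand-written illustration, not output produced by Aglio or the API Blueprint parser; both function names and the exact description wording are assumptions.

```python
import json

SCHEMA_URI = "http://json-schema.org/draft-04/schema#"


def describe_in_main_description(title, values):
    """Workaround 1: fold the per-value descriptions into one description."""
    summary = ", ".join(f"{value} = {text}" for value, text in values.items())
    return {
        "$schema": SCHEMA_URI,
        "description": f"{title}: {summary}",
        "enum": list(values),
    }


def describe_with_one_of(values):
    """Workaround 2: one single-value enum per alternative, each described."""
    return {
        "$schema": SCHEMA_URI,
        "oneOf": [
            {"enum": [value], "description": text}
            for value, text in values.items()
        ],
    }


video_delete = {0: "Successful deletion", 1: "Error occurred"}

print(json.dumps(describe_in_main_description("Video Delete", video_delete), indent=2))
print(json.dumps(describe_with_one_of(video_delete), indent=2))
```

The first form keeps a plain enum that any validator understands, at the cost of unstructured descriptions. The second keeps each description next to its value but is more verbose and harder to read.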

SuperCollider: convert a Dictionary to YAML

SuperCollider has a String:parseYAML method that can create a nested Dictionary:
"{44: 'woo'}".parseYAML
Dictionary[ (44 -> woo) ]
But how to go the other way, output a YAML string given a (possibly nested) Dictionary?
[answer is from someone else outside]
Does the document have to be readable?
I've been using JSON.stringify from Felix's API quark in order to share dictionaries with a Max MSP application.
The result of this method is not human-readable: it doesn't generate any newlines, tabs, etc. So it doesn't look pretty in a text document, but I imagine that was never the intention behind the method's design.

Resources