How to accept and validate this map in gRPC protoc? - go

I'm sending the following POST request to my gRPC application:
curl \
--request POST \
--header 'Content-Type: application/json' \
--data-raw '{
"mandatory-key1": "value1",
"mandatory-key2": {
"arbitrary-optional-key1": [
"b",
"c"
],
"arbitrary-optional-key2": [
"e"
]
}
}' \
'http://localhost:11000/MyEndpoint'
The value associated with mandatory-key1 must be a non-empty string.
The value associated with mandatory-key2 must be a map in which every key is a string and every value is a list of strings.
Now I have to model this request's data structure in the gRPC proto file.
I am thinking of doing something like this:
message MyRequestData {
// pairs represents the map that the user sends to MyEndpoint.
map<string, string> pairs = 1;
}
But this specification is not general enough. I need to know how to write this specification correctly.
Question 1: How can I write this specification so it accepts strings in the values and also lists of strings?
Question 2: How can I do validation such that I ensure pairs has keys mandatory-key1 and mandatory-key2 and nothing else?
Question 3: How can I do validation such that I ensure:
pairs has keys mandatory-key1 and mandatory-key2 and nothing else?
pairs["mandatory-key1"] has a value which is a non-empty string?
pairs["mandatory-key2"] has a value which is a map of <string, list of non-empty strings>?

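Protobuf itself doesn't provide the validation that you need: a .proto file describes the shape of a message, not constraints such as "non-empty" or "no other keys".
You'd need to code your validation yourself, in the code that uses the protoc-generated sources.
Protobuf doesn't support repeated values in a map directly (map<string, repeated string> is not valid), but you can wrap the list in a message of its own: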
message Request {
string mandatory_key1 = 1;
map<string, Value> mandatory_key2 = 2;
}
message Value {
repeated string value = 1;
}
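Because the validation has to be written by hand, the rules from the question can be sketched as a plain function. This is only an illustration, written in Python rather than Go, and it checks the decoded JSON body rather than the generated message type. The name validate_request and the error strings are made up for this sketch and are not part of gRPC or protobuf.

```python
from typing import Any

EXPECTED_KEYS = {"mandatory-key1", "mandatory-key2"}


def validate_request(data: Any) -> list[str]:
    """Return a list of problems; an empty list means the payload is valid."""
    if not isinstance(data, dict):
        return ["payload must be a JSON object"]

    errors: list[str] = []
    keys = set(data)

    # Exactly the two mandatory keys, nothing else.
    for name in sorted(EXPECTED_KEYS - keys):
        errors.append(f"missing key: {name}")
    for name in sorted(keys - EXPECTED_KEYS):
        errors.append(f"unexpected key: {name}")

    # mandatory-key1 must be a non-empty string.
    if "mandatory-key1" in data:
        value = data["mandatory-key1"]
        if not (isinstance(value, str) and value):
            errors.append("mandatory-key1 must be a non-empty string")

    # mandatory-key2 must map strings to lists of non-empty strings.
    if "mandatory-key2" in data:
        mapping = data["mandatory-key2"]
        if not isinstance(mapping, dict):
            errors.append("mandatory-key2 must be a map")
        else:
            for name, values in mapping.items():
                ok = isinstance(values, list) and all(
                    isinstance(v, str) and v for v in values
                )
                if not ok:
                    errors.append(
                        f"mandatory-key2[{name!r}] must be a list of non-empty strings"
                    )
    return errors
```

Returning every problem at once, rather than stopping at the first one, lets the endpoint report all violations in a single error response. The same checks carry over to Go as a function over the generated Request struct.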

Related

Why doesn't FastAPI handle types derived from int and Enum correctly?

With the following FastAPI backend:
from enum import Enum
from fastapi import FastAPI
class MyNumber(int, Enum):
    ONE = 1
    TWO = 2
    THREE = 3
app = FastAPI()
@app.get("/add/{a}/{b}")
async def get_model(a: MyNumber, b: MyNumber):
    return {"sum": a + b}
When a GET operation is done:
curl -X 'GET' \
'http://127.0.0.1:8000/add/2/3' \
-H 'accept: application/json'
Returns the following:
{
"detail": [
{
"loc": [
"path",
"a"
],
"msg": "value is not a valid enumeration member; permitted: 1, 2, 3",
"type": "type_error.enum",
"ctx": {
"enum_values": [
1,
2,
3
]
}
},
{
"loc": [
"path",
"b"
],
"msg": "value is not a valid enumeration member; permitted: 1, 2, 3",
"type": "type_error.enum",
"ctx": {
"enum_values": [
1,
2,
3
]
}
}
]
}
Why is this the case? Even the Swagger UI recognizes the possible values as integers.
I have tried the solution of using IntEnum instead, and I can confirm that it works, but still: why does it have to be this way?
The enum.py source code defines IntEnum as:
class IntEnum(int, Enum):
    """Enum where members are also (and must be) ints"""
I did a little digging, based on the comments below the question. To start off with why this does not work as expected: it comes down to how Starlette handles path parameters. When Starlette handles a request (or, to be precise, when the Router object that APIRouter inherits from is called), it builds path_params as a dictionary that maps parameter names to their values. If you haven't specified a type in the path string, it treats every path parameter as a string, so what gets attached to the Request is basically a dict like {"a":"1", "b":"2"}. Further up the call stack, FastAPI then tries to find the enum member corresponding to "1", which raises the validation error because "1" is not 1. Note that your MyNumber class does not play any role in how Starlette builds path_params.
This behaviour is by design and can be influenced like so:
@app.get("/add/{a:int}/{b:int}")
This will make sure that Starlette will create a path_params dictionary like {"a":1, "b":2}. Note that 1 and 2 are now integers and will be handled as such by FastAPI.
As to why MyNumber(IntEnum) works out of the box while MyNumber(int, Enum) does not: that comes down to Pydantic's implementation. I can't really pinpoint how exactly, but when the ModelField is created for both parameters (which happens in utils.py -> create_response_field() upon application startup), the list of validators includes an int validator when using IntEnum.
But because ModelField is created using functools.partial(), I cannot follow the call stack into Pydantic, so I am unsure why it behaves like this.
So, in short, there are two options to fix this: either force Starlette to parse the path parameters a and b as int (using {a:int}), or inherit from IntEnum in your own enum class.
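The string-versus-integer mismatch can be reproduced with the standard library alone, without FastAPI, Starlette, or Pydantic. The sketch below only illustrates plain enum lookup by value; the helper lookup is a made-up name, and the snippet does not claim to show how Pydantic builds its validators.

```python
from enum import Enum, IntEnum


class MyNumber(int, Enum):
    ONE = 1
    TWO = 2
    THREE = 3


def lookup(raw):
    """Return the member whose value equals raw, or None if there is none."""
    try:
        return MyNumber(raw)
    except ValueError:
        return None


# An untyped path parameter arrives as a string, and "2" is not a member value.
from_string = lookup("2")

# Once the value has been converted to int (what {a:int} does), the lookup succeeds.
from_int = lookup(int("2"))

# Members behave like ints, so the sum in the endpoint works after conversion.
total = from_int + lookup(3)

# Despite sharing the same bases, MyNumber is not a subclass of IntEnum,
# so code that special-cases IntEnum will not treat it the same way.
is_int_enum = issubclass(MyNumber, IntEnum)
```

Here from_string is None, from_int is MyNumber.TWO, total equals 5, and is_int_enum is False.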

Get Google Drive item id of file with known path

A file is located in a known path on Google Drive, for example:
/root/Myfiles/test.txt
How can I get the item ID of the file using the Google Drive V3 REST API (https://www.googleapis.com/drive/v3/files/)? In detail, I am not sure how to construct the query parameter q= for this.
Unless you already have the file ID of Myfiles, you're going to have to do this in two calls.
The first thing we will do is search the directories in root for Myfiles.
This can be done using the q parameter, as you already know.
By passing parents in 'root' and mimeType = 'application/vnd.google-apps.folder' and name = 'Myfiles', I tell it that I am looking for a folder called Myfiles that has root as its parent folder.
curl \
'https://www.googleapis.com/drive/v3/files?q=parents%20in%20%27root%27%20and%20mimeType%20%3D%20%27application%2Fvnd.google-apps.folder%27%20and%20name%20%3D%27Myfiles%27&key=[YOUR_API_KEY]' \
--header 'Authorization: Bearer [YOUR_ACCESS_TOKEN]' \
--header 'Accept: application/json' \
--compressed
The response from this will then look something like this
{
"kind": "drive#fileList",
"incompleteSearch": false,
"files": [
{
"kind": "drive#file",
"id": "1R_QjyKyvET838G6loFSRu27C-3ASMJJa",
"name": "Myfiles",
"mimeType": "application/vnd.google-apps.folder"
}
]
}
I now know the file ID of the folder called Myfiles.
Now I can make another call that requests a file named test.txt within that folder ID, like this: parents in '1R_QjyKyvET838G6loFSRu27C-3ASMJJa' and name = 'test.txt'
The request will then look something like this:
curl \
'https://www.googleapis.com/drive/v3/files?q=parents%20in%20%271R_QjyKyvET838G6loFSRu27C-3ASMJJa%27%20and%20name%20%3D%20%27test.txt%27&key=[YOUR_API_KEY]' \
--header 'Authorization: Bearer [YOUR_ACCESS_TOKEN]' \
--header 'Accept: application/json' \
--compressed
The response
{
"kind": "drive#fileList",
"incompleteSearch": false,
"files": [
{
"kind": "drive#file",
"id": "1_BgrWKsjnZvayvr2kbdHzSzE3K2tNsWhntBsQwfrDOw",
"name": "test.txt",
"mimeType": "application/vnd.google-apps.document"
}
]
}
Summary
As @DalmTo said, if you want to search for files within a specific folder, you need that folder's ID to search within it.
parents in: Whether the parents collection contains the specified ID.
This means that you need two separate queries: one asking for the ID of your folder and another looking for the file test.txt in that folder.
q: parents in "root" and mimeType = "application/vnd.google-apps.folder" and name = "Myfiles"
q: parents in "ID_FOLDER" and mimeType = "text/plain" and name = "test.txt"
Example:
If only one file in your entire Drive meets the required characteristics, you could do it in a single query:
q: name = "test.txt" and mimeType = "text/plain"
Caution: If you uploaded the file, Drive may have detected it as application/octet-stream. Normally .txt files are detected as text/plain. For more information on MIME types and the Drive API, see the lists of common MIME types and of Drive-specific MIME types.
Alternative using Google Apps Script
Here is an example using Google Apps Script:
function findFile() {
  // First query: the folder named Myfiles directly under root.
  const folderQuery = '"root" in parents and title = "Myfiles" and mimeType = "application/vnd.google-apps.folder"';
  const folder = Drive.Files.list({ q: folderQuery });
  const folderId = folder.items[0].id;
  // Second query: the file test.txt inside that folder.
  const fileQuery = `parents in "${folderId}" and title = "test.txt"`;
  const file = Drive.Files.list({ q: fileQuery });
  return file.items[0].id;
}
Caution: This Google Apps Script example uses Drive API v2, where the query_term name becomes title.
More Information
For a deeper understanding of how the Drive API works you can check Search for files guide:
A query string contains the following three parts:
query_term operator values
query_term is the query term or field to search upon.
operator specifies the condition for the query term.
values are the specific values you want to use to filter your search results
To keep in mind when used outside of a client library:
Note: These examples use the unencoded q parameter, where name = 'hello' is encoded as name+%3d+%27hello%27. Client libraries handle this encoding automatically.
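When calling the REST endpoint without a client library, the encoding step can be left to a URL library instead of being done by hand. The following is a minimal sketch in Python that only builds the request URLs for the two-step lookup and sends nothing. The helper names quote_value, child_query, and list_url are made up for this example, and it uses the 'ID' in parents form shown in the Apps Script example.

```python
from typing import Optional
from urllib.parse import urlencode

FILES_ENDPOINT = "https://www.googleapis.com/drive/v3/files"
FOLDER_MIME = "application/vnd.google-apps.folder"


def quote_value(value: str) -> str:
    """Wrap a value in single quotes, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def child_query(parent_id: str, name: str, mime_type: Optional[str] = None) -> str:
    """Build a q string that matches an item by name inside a given parent."""
    parts = [f"{quote_value(parent_id)} in parents", f"name = {quote_value(name)}"]
    if mime_type is not None:
        parts.append(f"mimeType = {quote_value(mime_type)}")
    return " and ".join(parts)


def list_url(query: str) -> str:
    """Return the files.list URL with the q parameter percent-encoded."""
    return f"{FILES_ENDPOINT}?{urlencode({'q': query})}"


# Step 1: find the folder Myfiles under root.
folder_url = list_url(child_query("root", "Myfiles", FOLDER_MIME))

# Step 2: find test.txt inside it (the ID is the one from the sample response above).
file_url = list_url(child_query("1R_QjyKyvET838G6loFSRu27C-3ASMJJa", "test.txt"))
```

Each URL still needs the Authorization header from the curl examples when it is actually requested.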

How do I specify the --form field in Ruby code (for an HTTP GET request)?

I know that in a curl command, there is an option to specify a --form like this:
-F 'ns=com.my-organization.canvas-app'
I want to know how I can convert this to Ruby code when making an HTTP GET request.
I've managed to figure out how to specify the --data field in Ruby code. So if the --data field of a curl command looks like this:
-d property1=value1 -d property2=value2
Then the data field in the corresponding Ruby code should look something like this:
data = {property1: "value1", property2: "value2"}
But now I'm trying to understand how I can convert this:
-F 'ns=com.my-organization.canvas-app'
into the corresponding Ruby code.
I'm also using HTTParty for the HTTP requests.
As of now, this is the code I have for the GET request:
form={ns: "com.my-organization.canvas-app"}
getResponse = HTTParty.get(base_url,:body => form.json, :headers => $header)
puts getResponse.body
As you can see, I specified the --form in a variable called "form" and used it as an argument to the HTTParty.get() call. Am I doing this correctly?

Ruby to_yaml stringifies my json

I am trying to convert a ruby hash to yaml. I'd like part of the hash be valid json; however, when I try to serialize the json string, it is converted to yaml in quotes.
For example, when I just have a simple string, the output is as follows (note foo is not in quotations):
request = {}
request['body'] = 'foo'
request.to_yaml # outputs: body: foo
However, when I add something to the beginning of the string, like { foo the output for body gets quoted:
request['body'] = '{ foo'
request.to_yaml # outputs: body: '{ foo'
How can I get around this? I've tried JSON.parse and, though that may work, I can't be guaranteed that this input will actually be JSON (it could be XML, etc.). I just want to give back whatever was given to me, but not "stringified".
Basically, I want to give an object that looks like:
{
  'request' => {
    'url' => '/posts',
    'method' => 'GET',
    'headers' => [
      'Content-Type' => 'application/json'
    ]
  },
  'response' => {
    'code' => 200,
    'body' => '[{"id":"ef4b3a","title":"this is the title"},{"id":"a98c4f","title":"title of the second post"}]'
  }
}
Which returns:
request:
  url: /posts
  method: GET
  headers:
  - Content-Type: application/json
response:
  code: 200
  body:
    [{"id":"ef4b3a","title":"this is the title"},{"id":"a98c4f","title":"title of the second post"}]
The reason being: right now, I can go from yaml to the correct ruby hash but I can't go the other way.
The method my_hash.to_yaml() just takes a hash and converts it to YAML without doing anything special to the values. The method does not care whether your string is JSON or XML, it just treats it as a string.
So why is my JSON being put into quotes when other strings aren't?
Good question! The reason is simple: curly braces are a valid part of YAML syntax.
This:
my_key: { sub: 1, keys: 2}
This is called flow mapping syntax in YAML, and it allows you to write nested mappings on one line. To escape strings which have curly braces in them, YAML uses quotes:
my_key: "{ sub: 1, keys: 2}" # this is just a string
Of course, quotes are allowed around any string; they are merely optional when the string is unambiguous without them:
my_key: "foo" # same as my_key: foo
Okay, but I want to_yaml() to find my JSON string and convert it to YAML mappings like the rest of the hash.
Well then, you need to convert your JSON string to a hash like the rest of your hash. to_yaml() converts a hash to YAML. It doesn't convert strings to YAML. The proper method for doing this is to use JSON.parse, as you mentioned:
request['body'] = JSON.parse( '{"id":"ef4b3a"}' )
But the string might not be JSON! It might be XML or some other smelly string.
This is exactly why to_yaml() doesn't convert strings. A wise programmer once told me: "Strings are strings. Strings are not data structures. Strings are strings."
If you want to convert a string into a data structure, you need to validate it and parse it. Because there's no guarantee that a string will be valid, it's your responsibility as a programmer to determine whether your data is JSON or XML or just bad, and to decide how you want to respond to each bit of data.
Since it looks like you're parsing web pages, you might want to consider using the same bit of data other web clients use to parse these things:
{
  'request' => {
    'url' => '/posts',
    'method' => 'GET',
    'headers' => [
      'Content-Type' => 'application/json' #<== this guy, right here!
    ]
  },
  'response' => {
    'code' => 200,
    'body' => '[{"id":"ef4b3a","title":"this is the title"},{"id":"a98c4f","title":"title of the second post"}]'
  }
}
If the content-type doesn't agree with the body then you should throw an error because your input data is bad.
The reason '{ foo' requires quotes is that the YAML specification demands it; see section 7.3.3, Plain Style.
Excerpt:
Plain scalars must never contain the “: ” and “#” character combinations. Such combinations would cause ambiguity with mapping key: value pairs and comments. In addition, inside flow collections, or when used as implicit keys, plain scalars must not contain the “[”, “]”, “{”, “}” and “,” characters. These characters would cause ambiguity with flow collection structures.
Based on the above, even your stated "return" value is incorrect, and the body is probably enclosed in single quotes, e.g.
response:
  code: 200
  body: '[{"id":"ef4b3a","title":"this is the title"},{"id":"a98c4f","title":"title of the second post"}]'
Otherwise it would create ambiguity with "Flow Sequences" ([,]) and "Flow Mappings" ({,}).
If you would like the JSON, XML, or other notation to be represented as structured data (read: as objects), then you will need to determine the correct parser (perhaps from the "Content-Type") and parse it before converting it to YAML.
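The "pick the parser from the Content-Type" step can be sketched as a small dispatch function. This is an illustration in Python rather than Ruby, and it handles only JSON; the function name decode_body is made up. Any other media type is passed through unchanged, so a YAML serializer would emit it as an ordinary (quoted) string.

```python
import json
from typing import Any


def decode_body(content_type: str, body: str) -> Any:
    """Parse body according to its declared media type, or return it untouched."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        # json.loads raises json.JSONDecodeError (a ValueError) when the body
        # does not match the declared type, which surfaces bad input early.
        return json.loads(body)
    return body


posts = decode_body(
    "application/json; charset=utf-8",
    '[{"id":"ef4b3a","title":"this is the title"}]',
)
untouched = decode_body("application/xml", "<posts/>")
```

After this step, posts is a list of dictionaries that serializes as nested YAML, while untouched is still the original string.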

CouchDB Filtered Replication

Trying out filters for replication, I stumbled upon a problem.
While my filter works as an entry in the _replicator database, it doesn't when using cURL.
The filter in the design document is:
{
"_id": "_design/partial",
"filters": {
"mobile": "function(doc, req) {
if (doc._attachments) {
var result = new Boolean(true);
for (attachment in doc._attachments) {
if (attachment.content_type == 'image/jpeg') {
return true;
}
if (doc._attachments.length > 1024) {
result = false;
}
}
return result;
} else {
return true;
}
}"
}
}
The cURL line:
curl -X POST http://admin:pass@192.168.178.13:5985/_replicate -d '{\"source\":\"http://admin:pass@192.168.2:5984/docs2\",\"target\":\"docs2_partial\",\"filter\":\"partial/mobile\",\"create_target\":true}' -H "Content-Type: application/json"
I created _design/partial document on both target and source, but all documents are being replicated. Even the one with an attached binary bigger than 1 MB.
Any help is appreciated!
The cURL reply is:
{"ok":true,"session_id":"833ff96d21278a24532d116f57c45f31","source_last_seq":32,"replication_id_version":2,"history":[{"session_id":"833ff96d21278a24532d116f57c45f31","start_time":"Wed, 17 Aug 2011 21:43:46 GMT","end_time":"Wed, 17 Aug 2011 21:44:22 GMT","start_last_seq":0,"end_last_seq":32,"recorded_seq":32,"missing_checked":0,"missing_found":28,"docs_read":28,"docs_written":28,"doc_write_failures":0}]}
Using either " instead of \" or " instead of ' the result is:
{"error":"bad_request","reason":"invalid UTF-8 JSON: [...]}
Now I think perhaps the logic of your filter function simply has a bug. Here is how I read your filter policy:
All docs that have no attachments pass
All docs that have an image/jpeg attachment pass
Docs with more than 1,024 attachments fail
In any other case, the docs pass
That sounds like perhaps an incorrect policy. Another way to restate this policy is "Docs with more than 1024 attachments fail, everything else passes." However since you wrote so much code, I suspect my summary is not the true policy.
Another quick note, on what looks like a bug. Given:
for (attachment in doc._attachments) { /* ... */ }
The attachment variable will be things like "index.html" or "me.jpeg", i.e. filenames. To get the attachment content-type, you need:
var type;
// This is WRONG
type = attachment.content_type; // type set to undefined
// This is RIGHT
type = doc._attachments[attachment].content_type; // type set to "text/html" etc.
To avoid this bug, you could change your code to make things more clear:
for (attachment_filename in doc._attachments) { /* ... */ }
Next, doc._attachments.length will tell you the number of attachments in the document, not for example the length of the current attachment. It is odd that you test for that inside the loop, because the expression will never change. Are you trying to test for attachment size instead?
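To make the two fixes concrete, here is the filter logic sketched in Python rather than in the JavaScript that CouchDB actually runs. It assumes the intended policy is: a document passes if it has no attachments or has any JPEG attachment, and otherwise fails when some attachment is larger than a size limit. The limit of 1 MB and the names passes_filter and MAX_BYTES are assumptions for illustration; the length and content_type fields are the ones found on each attachment stub.

```python
MAX_BYTES = 1024 * 1024  # assumed limit of 1 MB per attachment


def passes_filter(doc: dict, max_bytes: int = MAX_BYTES) -> bool:
    """Mirror the question's control flow with both bugs fixed."""
    attachments = doc.get("_attachments")
    if not attachments:
        return True

    result = True
    # Iterating a mapping yields filenames, so look up each stub explicitly.
    for filename, stub in attachments.items():
        if stub.get("content_type") == "image/jpeg":
            return True
        # Test the size of this attachment, not the number of attachments.
        if stub.get("length", 0) > max_bytes:
            result = False
    return result


plain = {"_id": "a"}
photo = {"_attachments": {"me.jpeg": {"content_type": "image/jpeg", "length": 5_000_000}}}
binary = {"_attachments": {"big.bin": {"content_type": "application/octet-stream", "length": 2_000_000}}}
```

With these sample documents, plain and photo pass while binary is rejected. In the JavaScript filter, the same two changes are doc._attachments[attachment].content_type and doc._attachments[attachment].length.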
What is the output from curl (i.e. from CouchDB)?
From your example, my first guess is that you have a quoting error. Inside single-quotes, you do not need to escape the double-quotes. Try removing all those backslashes. What happens?
If you are on Windows, the single quote is not valid in the shell. In that case, keep the backslashes and just change the single-quote to a double-quote.
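One way to sidestep manual quoting is to generate the request body with a JSON library and let a helper produce the shell quoting. The sketch below uses Python's standard library. The host name source-host is a placeholder, not the address from the question, and shlex.quote targets POSIX shells only, so it does not cover the Windows case.

```python
import json
import shlex

# Replication request as a plain data structure; no escaping is needed here.
payload = {
    "source": "http://admin:pass@source-host:5984/docs2",  # placeholder host
    "target": "docs2_partial",
    "filter": "partial/mobile",
    "create_target": True,
}

# The JSON text that CouchDB should receive: plain double quotes, no backslashes.
body = json.dumps(payload)

# The same text wrapped for a POSIX shell, ready to pass to curl -d.
shell_argument = shlex.quote(body)
```

The value of shell_argument is the JSON text wrapped in single quotes, which is the form suggested above for non-Windows shells.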
