How to limit the length of a JSON string after encoding?

How to limit the length of a JSON string after encoding? - go

I am using go to collect responses from URLs and store them in JSON for later processing. I want to enforce a limit on the size of the responses I store, however I've been unable to enforce this limit properly as when the JSON is marshalled the string is escaped which pushes it's length over the limit.
How do I limit the length of the string including the escape characters when marshaling?

How do I limit the length of the string including the escape characters when marshaling?
You cannot. Dead simple.

Related

How does GitHub encode their graphQL cursors?

GitHub's graphql cursors are intentionally opaque, so they shouldn't ever be decode by a client. However I'd like to know their approach towards pagination, especially when combined with sorting.

There are multiple layers of encoding for the encoding used for pagination cursors used by GitHub. I will list them in order from the perspective of a decoder:
The cursor string is encoded using URL safe base64 meaning it uses - and _ instead of + and /. This might be to have consistency with their REST based API.
Decoding the base64 string gives us another string in the format of cursor:v2:[something] so the next step is decoding the something.
The 'something' is a binary encoded piece of data containing the actual cursor properties. The first byte defines the cursor type:
0x91 => We don't use any sorting, the cursor contains the length of the id field and the id itself. 0xcd seems to indicate a two-byte id, 0xce a four-byte id. This is followed by the id itself, which can be verified by decoding the base64 id graphql field.
0x92 => A composite cursor containing the sorted property and the id. This is either a length-prefixed ordinal number or two bytes plus a string or ISO date string followed by the length-prefixed id.

Size limit for GraphQL scalar String

I want to send images as Base64 encoded strings. However, the size of each generated string turns out to be very large. My question is therefore, does GraphQL restrict the scalar String to a certain length, i.e. is it possible for me to send my images as strings using GraphQL?

As per last available specs, GraphQL does not restrict the scalar String to a certain length, i.e. it's possible for you to send it as a String.

What is the length and characters in ids generated by elasticsearch?

The documentation of elastic search states:
The index operation can be executed without specifying the id. In such
a case, an id will be generated automatically.
But it does not provide any information about the properties of the ids.
What is the length (minimun/maximum)?
my guess is 22.
Which characters are used in the id?
My guess is [-_A-Za-z0-9]
Can the properties of the generated ids change at any time (is that part of the API)?

Auto-generated ids are random base64-encoded UUIDs. The base64 algorithm is used in URL-safe mode hence - and _ characters might be present in ids.

Auto-generated ids by elasticsearch are exactly 20 characters length (not 22 characters) and encoded by url-safe base64 algorithm [-_A-Za-z0-9].
Read more in documentation: https://www.elastic.co/guide/en/elasticsearch/guide/master/index-doc.html#_autogenerating_ids

Compression algorithms for Strings

I have to generate QRCodes using concatenated object properties. These strings might be long, that's why I'd like to know which compression algorithm to use knowing that my String's length is between 25 an 100+ characters
thanks in advance,
Jerec

I am assuming that since you are going to use compression before you store the strings that these QR codes will not be readable by any client, it would have to be an application that you wrote (b/c you are storing character with an unknown encoding, the client won't be able to decode).
Instead of compressing and storing the long string in the QR code, have your application create a URI (like a GUID or a URL) and when your application decodes that URI it looks up all the values (uncompressed) that you wanted to store in the QR code. Then your app can just look up the format in any way it wants.
For example, assuming your persistant storage is an xml file, but it could be anything:
<URI = "http://mydomain.com/790C9704-8C61-435F-991D-CDBB5767AA3D">
<MyElement>14523</MyElement>
<MyElement>67548</MyElement>
...
<MyElement>46167</MyElement>
</URI>
Encoded on QR code: "http://mydomain.com/790C9704-8C61-435F-991D-CDBB5767AA3D", values can then be looked up.

The algorithm used to encode QR codes is dependent on the type of data you encode. See http://www.swetake.com/qr/qr1_en.html.
If you know, for example, that you always have the same number of digits per id and therefor could just string them together without punctuation, you can encode them as purely numeric and you'll use 10 bits for every three characters.
If you need some kind of separator, if you use something in "0-9A-Z $%*+-./:", you'll stay alphanumeric and get 2 characters in 11 bits.
If you give it arbitrary data (note that this includes any lower case: the list above does not include lower case letters) you're going to be using 8 bits per characters.
So numeric only would end up being 60% smaller.

UTF-8 string delimiter

I am parsing a binary protocol which has UTF-8 strings interspersed among raw bytes. This particular protocol prefaces each UTF-8 string with a short (two bytes) indicating the length of the following UTF-8 string. This gives a maximum string length 2^16 > 65 000 which is more than adequate for the particular application.
My question is, is this a standard way of delimiting UTF-8 strings?

I wouldn't call that delimiting, more like "length prefixing". Some people call them Pascal strings since in the early days the language Pascal was one of the popular ones that stored strings that way in memory.
I don't think there's a formal standard specifically for just that, as it's a rather obvious way of storing UTF-8 strings (or any strings of bytes for that matter). It's defined over and over as a part of many standards that deal with messages that contain strings, though.

UTF8 is not normally de-limited, you should be able to spot the multibyte characters in there by using the rules mentioned here: http://en.wikipedia.org/wiki/UTF-8#Description

i would use a delimiter which starts with 0x11......
but if you send raw bytes you will have to exclude this delimiter from the data\messages processed ,this means that if there is a user input similar to that delimiter, you will have to convert it.
if the user inputs any utf8 represented char you may simply send it as is.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

How to limit the length of a JSON string after encoding? - go

How do I limit the length of the string including the escape characters when marshaling? You cannot. Dead simple.

Related

How does GitHub encode their graphQL cursors?

Size limit for GraphQL scalar String

What is the length and characters in ids generated by elasticsearch?

Compression algorithms for Strings

UTF-8 string delimiter

Categories

Resources