Go: how to not remove double quotes on cookies - go

Go removes double quotes in cookies. Is there a way to keep double quotes in cookies in Go?
For example, I'm sending a small JSON message and http.SetCookie strips the double quotes.
http.SetCookie(w, &http.Cookie{Name: "my_cookie_name", Value: small_json_message})
More about Cookies:
The HTTP RFC defines quoted string values. See https://www.rfc-editor.org/rfc/rfc7230#section-3.2.6
The proposed cookie RFC explicitly says double quotes are allowed in cookie values: cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
Go currently has a condition to insert double quote into cookies, so obviously double quotes are allowed.
; is the cookie delimiter.
Values with ASCII characters outside the limited ASCII range may be quoted (The RFC calls this the quoted_string) which expands the allowed character set.
JSON does not contain the ; character, so for ; to appear in JSON it can only appear in string values. In JSON, string values are already quoted.
I've confirmed testing using a simple k:v JSON payload and it works fine on all major browsers with no issues.
Cookies are typically generated from server data, not user data. This means well-structured, not arbitrary, JSON may be used.
JSON can easily conform to the cookie RFC. Additionally, even though it's not an issue with this example of JSON, regarding the hypothetical concern of not conforming to the RFC:
A cookie is transmitted as an HTTP header. Many HTTP headers commonly disregard the RFC. For example, the Sec-Ch-Ua header created by Chrome includes several "bad" characters.
Sec-Ch-Ua: "Chromium";v="92", " Not A;Brand";v="99", "Google Chrome";v="92"
"comma" is "disallowed" and it's used all the time.
Even if double quotes were "wrong", which they are not, but if they were, there are lots of in-the-wild examples of cookies containing quotes.
For reference, here's the relevant section of RFC 6265
set-cookie-header = "Set-Cookie:" SP set-cookie-string
set-cookie-string = cookie-pair *( ";" SP cookie-av )
cookie-pair = cookie-name "=" cookie-value
cookie-name = token
cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
; US-ASCII characters excluding CTLs,
; whitespace DQUOTE, comma, semicolon,
; and backslash
Where one can see whitespace DQUOTE is disallowed and DQUOTE is allowed.

HTTP cookies are allowed to have double quotes.
Are you sure? rfc6265 states:
set-cookie-header = "Set-Cookie:" SP set-cookie-string
set-cookie-string = cookie-pair *( ";" SP cookie-av )
cookie-pair = cookie-name "=" cookie-value
cookie-name = token
cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
; US-ASCII characters excluding CTLs,
; whitespace DQUOTE, comma, semicolon,
; and backslash
So it appears that Go is following the specification (the spec previously defines DQUOTE to mean double quote).
See this issue for further information.
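To make the grammar above concrete, here is a minimal sketch of a checker for cookie-value as quoted from RFC 6265. The helper names isCookieOctet and validCookieValue are made up for this illustration and are not part of net/http. It shows that one pair of double quotes around the whole value passes, while double quotes inside the value (as in JSON) do not.

```go
package main

import "fmt"

// isCookieOctet reports whether b is a cookie-octet per the grammar above:
// %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
func isCookieOctet(b byte) bool {
	switch {
	case b == 0x21,
		b >= 0x23 && b <= 0x2B,
		b >= 0x2D && b <= 0x3A,
		b >= 0x3C && b <= 0x5B,
		b >= 0x5D && b <= 0x7E:
		return true
	}
	return false
}

// validCookieValue (hypothetical helper) checks v against
// cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE ).
func validCookieValue(v string) bool {
	// One optional pair of surrounding double quotes is allowed.
	if len(v) >= 2 && v[0] == '"' && v[len(v)-1] == '"' {
		v = v[1 : len(v)-1]
	}
	for i := 0; i < len(v); i++ {
		if !isCookieOctet(v[i]) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(validCookieValue(`abc`))       // true
	fmt.Println(validCookieValue(`"abc"`))     // true: quotes wrap the whole value
	fmt.Println(validCookieValue(`{"k":"v"}`)) // false: quotes inside the value
}
```

This only models the grammar as written; what browsers accept in practice can be more lenient.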

Yes. Double quotes are allowed in cookies and other HTTP headers. Double quotes are also used to escape characters that would be otherwise invalid.
Here's a way to manually set a cookie, where w is the http.ResponseWriter and b is the bytes of the cookie. Max-Age is in seconds and 999999999 is ~31.69 years.
w.Header().Set("Set-Cookie", "your_cookie_name="+string(b)+"; Path=/; Domain=localhost; Secure; Max-Age=999999999; SameSite=Strict")
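As a rough comparison of the two approaches, the sketch below writes the same JSON payload once through http.SetCookie and once by setting the header by hand, using httptest.NewRecorder as a stand-in ResponseWriter. The function names viaSetCookie and viaHeader and the cookie name "c" are placeholders chosen for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// viaSetCookie returns the Set-Cookie header produced by http.SetCookie,
// which sanitizes the value before writing it.
func viaSetCookie(name, value string) string {
	rec := httptest.NewRecorder()
	http.SetCookie(rec, &http.Cookie{Name: name, Value: value, Path: "/"})
	return rec.Header().Get("Set-Cookie")
}

// viaHeader returns the Set-Cookie header written by hand, with no sanitizing.
func viaHeader(name, value string) string {
	rec := httptest.NewRecorder()
	rec.Header().Set("Set-Cookie", name+"="+value+"; Path=/")
	return rec.Header().Get("Set-Cookie")
}

func main() {
	payload := `{"k":"v"}`
	fmt.Println(viaSetCookie("c", payload)) // inner double quotes are dropped
	fmt.Println(viaHeader("c", payload))    // payload is written unchanged
}
```

Writing the header by hand also skips every other check http.SetCookie performs, so the caller is responsible for keeping characters such as ; out of the value.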
Since you'll probably use cURL to test sending cookies to the server, here's a useful test:
curl --insecure --cookie 'test1={"test1":"v1"}; test2={"test2":"v2"}' https://localhost:8081/
Which results in an HTTP request like the following:
GET / HTTP/2.0
Host: localhost:8081
Accept: */*
Cookie: test1={"test1":"v1"}; test2={"test2":"v2"}
User-Agent: curl/7.0.0
Also note, many of the RFC rules are ignored all the time, especially commas. For example, Chrome sets this header:
Sec-Ch-Ua: "Google Chrome";v="107", "Chromium";v="107", "Not=A?Brand";v="24"
Gasp! It uses commas and other things! It is well established that commas in HTTP headers are okay, even though the RFC appears to say otherwise.
Note that the Chrome header uses spaces, many double quotes, commas, special characters, and semicolons. Chrome here uses double quotes around the values that need escaping, and that's the right way to be compliant with the RFC.
Note that the cookie RFC 6265 is a proposal and has been for 11 years. There are reasons for that. Knowing that cookies are just an HTTP header, look at the quoted-string semantics for header values in the accepted HTTP RFC:
HTTP RFC 7230 3.2.6.
These characters are included for quoted-string:
HTAB, SP
!#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~
%x80-FF
The characters " and \ may be included if escaped with a backslash.
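The quoted-string rules listed above can be sketched as follows. The helper names isQdtext and quoteString are invented for this example. The code follows the generic HTTP header grammar from RFC 7230 section 3.2.6 and makes no claim that cookie parsers honor backslash escapes.

```go
package main

import (
	"fmt"
	"strings"
)

// isQdtext reports whether b may appear unescaped inside a quoted-string:
// qdtext = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text (%x80-FF)
func isQdtext(b byte) bool {
	return b == '\t' || b == ' ' || b == 0x21 ||
		(b >= 0x23 && b <= 0x5B) || (b >= 0x5D && b <= 0x7E) || b >= 0x80
}

// quoteString (hypothetical helper) wraps s in double quotes and
// backslash-escapes `"` and `\`. It returns false when s contains a byte
// this sketch does not try to encode, such as a control character.
func quoteString(s string) (string, bool) {
	var sb strings.Builder
	sb.WriteByte('"')
	for i := 0; i < len(s); i++ {
		b := s[i]
		switch {
		case b == '"' || b == '\\':
			sb.WriteByte('\\')
			sb.WriteByte(b)
		case isQdtext(b):
			sb.WriteByte(b)
		default:
			return "", false
		}
	}
	sb.WriteByte('"')
	return sb.String(), true
}

func main() {
	q, ok := quoteString(`{"test1":"v1"}`)
	fmt.Println(q, ok) // "{\"test1\":\"v1\"}" true
}
```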

Related

Why ruby controller would escape the parameters itself?

I am writing Ruby application for the back end service. There is a controller which would accept request from front-end.
Here is the case, there is a GET request with a parameter containing character "\n".
def register
begin
request = {
id: params[:key]
}
.........
end
end
The "key" parameter is passed from AngularJS as "----BEGIN----- \n abcd \n ----END---- \n", but in the Ruby controller the parameter actually became "----BEGIN----- \\n abcd \\n ----END---- \\n".
Does anyone have a good solution for this?
Yes, this is because of the ruby way to read the escape character. You can read the explanation right here: Escaping characters in Ruby
I ran into this once and used gsub to change the \\n to \n. Prefer gsub over gsub! here, because gsub! returns nil when nothing is replaced. What you should do is:
def register
begin
request = {
id: params[:key].gsub("\\n", "\n")
}
.........
end
end
Remember, you have to use double quotation " instead of single quotation '. From the link I gave:
The difference between single and double quoted strings in Ruby is the way the string definitions represent escape sequences.
In double quoted strings, you can write escape sequences and Ruby will output their translated meaning. A \n becomes a newline.
In single quoted strings however, escape sequences are escaped and return their literal definition. A \n remains a \n.

Dealing with special character in Nokogiri / Regex

I am getting the text from the body of an HTML doc as below. When I try to regex scan for the term "Exhibit 99", I get no matches, i.e., an empty array. However, in the HTML, I do see "Exhibit 99", although inspect element shows it with &nbsp99. How can I get rid of these HTML characters and search for "Exhibit 99" as if it were a regular string?
url = "https://www.sec.gov/Archives/edgar/data/1467373/000146737316000912/fy16q3plc8-kbody.htm"
doc = Nokogiri::HTML(URI.open(url)) # needs require "open-uri"; plain open(url) no longer opens URLs in Ruby 3.0+
body = doc.css("body").text
body.scan(/exhibit 99/i)
Unicode character space
You can use:
body.scan(/exhibit\p{Zs}99/i)
From the documentation about Unicode character’s General Category:
/\p{Z}/ - 'Separator'
/\p{Zs}/ - 'Separator: Space'
It matches a whitespace or a non-breaking space, but no tab or newline. The string should be encoded in UTF-8. See this related question for more information.
non-word character
A more permissive regex would be:
body.scan(/exhibit\W99/i)
This allows any character other than a letter, a digit or an underscore between exhibit and 99. It would match a whitespace, a nbsp, a tab, a dash, ...
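For comparison, the same three patterns can be tried in Go, whose regexp package also understands \p{Zs}. This is only a sketch: the sample text is invented, and Go's \W is ASCII-based, so it treats a non-breaking space as a non-word character.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	plainSpace = regexp.MustCompile(`(?i)exhibit 99`)       // literal ASCII space only
	spaceSep   = regexp.MustCompile(`(?i)exhibit\p{Zs}99`) // any Unicode space separator
	nonWord    = regexp.MustCompile(`(?i)exhibit\W99`)     // any non-word character
)

// countMatches returns how many non-overlapping matches re finds in s.
func countMatches(re *regexp.Regexp, s string) int {
	return len(re.FindAllString(s, -1))
}

func main() {
	// Invented sample: one non-breaking space (U+00A0) and one dash.
	body := "See Exhibit\u00a099 and exhibit-99"
	fmt.Println(countMatches(plainSpace, body)) // 0
	fmt.Println(countMatches(spaceSep, body))   // 1
	fmt.Println(countMatches(nonWord, body))    // 2
}
```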

Is + a disallowed character in URLs?

What is disallowed in the following URL?
http://myPortfolio/Showcase/Kimber+Tisdale+Photography
I am getting the error message "The URI you submitted has disallowed characters." As far as I understand, + is allowed, isn't it?
Reference: Which characters make a URL invalid?
It is an allowed character but not in the way you are using it. It is allowed in the query string part of a url, not in the url path names.
If you are just separating words, it is more usual to use a hyphen or an underscore, or %20 for a space. You can use CI's url helper to encode strings for you:
$title = 'Kimber Tisdale Photography';
$url_title = url_title($title, '-', TRUE); // third argument TRUE forces lowercase
// output: kimber-tisdale-photography
http://www.codeigniter.com/user_guide/helpers/url_helper.html#url_title
The + is allowed in URI paths.
You can check it yourself:
Visit the URI standard.
Check which characters are allowed in the Path component.
Note that every non-empty path may contain characters from the segment set.
Note that the segment set consists of characters from the pchar set.
Note that the pchar set contains characters from the sub-delims set.
And sub-delims is defined to consist of:
"!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
As you can see, the + is listed here.
(See my list of all allowed characters in URI paths.)
A prominent example of + in the path of HTTP(S) URIs is Google Plus profiles, e.g.:
https://plus.google.com/+MattCutts
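One way to cross-check the path versus query distinction is Go's net/url package, which treats + differently in each component. The helper names below are made up for this sketch, and it says nothing about CodeIgniter's own list of permitted URI characters, which is a separate configuration.

```go
package main

import (
	"fmt"
	"net/url"
)

// pathOf parses rawURL and returns its decoded path component.
func pathOf(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	return u.Path, nil
}

// decodeBoth decodes s once as a path segment and once as a query component.
func decodeBoth(s string) (asPath, asQuery string, err error) {
	if asPath, err = url.PathUnescape(s); err != nil {
		return "", "", err
	}
	if asQuery, err = url.QueryUnescape(s); err != nil {
		return "", "", err
	}
	return asPath, asQuery, nil
}

func main() {
	p, err := pathOf("http://myPortfolio/Showcase/Kimber+Tisdale+Photography")
	if err != nil {
		panic(err)
	}
	fmt.Println(p) // /Showcase/Kimber+Tisdale+Photography

	asPath, asQuery, _ := decodeBoth("Kimber+Tisdale")
	fmt.Println(asPath)  // Kimber+Tisdale (a literal plus in a path)
	fmt.Println(asQuery) // Kimber Tisdale (plus means space in a query)

	fmt.Println(url.PathEscape("Kimber Tisdale"))  // Kimber%20Tisdale
	fmt.Println(url.QueryEscape("Kimber Tisdale")) // Kimber+Tisdale
}
```

So + is legal in a path, but it only stands for a space inside a query string.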

Ruby: How to escape url with square brackets [ and ]?

This url:
http://gawker.com/5953728/if-alison-brie-and-gillian-jacobs-pin-up-special-doesnt-get-community-back-on-the-air-nothing-will-[nsfw]
should be:
http://gawker.com/5953728/if-alison-brie-and-gillian-jacobs-pin-up-special-doesnt-get-community-back-on-the-air-nothing-will-%5Bnsfw%5D
But when I pass the first one into URI.encode, it doesn't escape the square brackets. I also tried CGI.escape, but that escapes all the '/' as well.
What should I use to escape URLS properly? Why doesn't URI.encode escape square brackets?
You can escape [ with %5B and ] with %5D.
Your URL will be:
URL.gsub("[","%5B").gsub("]","%5D")
I don't like that solution but it's working.
encode doesn't escape brackets because they aren't special -- they have no special meaning in the path part of a URI, so they don't actually need escaping.
If you want to escape chars other than just the "unsafe" ones, pass a second arg to the encode method. That arg should be a regex matching, or a string containing, every char you want encoded (including chars the function would otherwise already match!).
If using a third-party gem is an option, try addressable.
require "addressable/uri"
url = Addressable::URI.parse("http://[::1]/path[]").normalize!.to_s
#=> "http://[::1]/path%5B%5D"
Note that the normalize! method will not only escape invalid characters but also perform casefolding on the hostname part, unescaping on unnecessarily escaped characters and the like:
uri = Addressable::URI.parse("http://Example.ORG/path[]?query[]=%2F").normalize!
url = uri.to_s #=> "http://example.org/path%5B%5D?query%5B%5D=/"
So, if you just want to normalize the path part, do as follows:
uri = Addressable::URI.parse("http://Example.ORG/path[]?query[]=%2F")
uri.path = uri.normalized_path
url = uri.to_s #=> "http://Example.ORG/path%5B%5D?query[]=%2F"
According to the IPv6 literal syntax, there can be URLs like this:
http://[1080:0:0:0:8:800:200C:417A]/index.html
Because of this we should escape [] only after host part of the url:
if url =~ %r{\[|\]}
protocol, host, path = url.split(%r{/+}, 3)
path = path.gsub('[', '%5B').gsub(']', '%5D') # Or URI.escape(path, /[^\-_.!~*'()a-zA-Z\d;\/?:#&%=+$,]/)
url = "#{protocol}//#{host}/#{path}"
end
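The same "escape brackets only after the host" idea can be sketched with Go's net/url, which keeps the host and path separate after parsing. The helper name escapePath is hypothetical, and the behavior shown relies on url.URL.String re-encoding Path when RawPath is empty.

```go
package main

import (
	"fmt"
	"net/url"
)

// escapePath (hypothetical helper) re-encodes only the path of rawURL,
// leaving an IPv6 literal such as [::1] in the host untouched.
func escapePath(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	// Clearing RawPath makes String() encode the decoded Path from scratch
	// instead of echoing the original spelling with its raw brackets.
	u.RawPath = ""
	return u.String(), nil
}

func main() {
	out, err := escapePath("http://[::1]/path[]")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // http://[::1]/path%5B%5D
}
```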

memcached client throws java.lang.IllegalArgumentException: Key contains invalid characters

It seems the memcached client doesn't support UTF-8 strings as keys, but I have to support i18n. Is there any way to fix it?
java.lang.IllegalArgumentException: Key contains invalid characters: ``HK:00:A Kung Wan''
at net.spy.memcached.MemcachedClient.validateKey(MemcachedClient.java:232)
at net.spy.memcached.MemcachedClient.addOp(MemcachedClient.java:254)
The issue here isn't UTF encoding. It's the fact that your key contains a space. Keys cannot have spaces, new lines, carriage returns, or null characters.
The line of code that produces the exception is below
if (b == ' ' || b == '\n' || b == '\r' || b == 0) {
throw new IllegalArgumentException
("Key contains invalid characters: ``" + key + "''");
}
Base64 Encode your key just before passing them to memcached client's set() and get() methods.
A general solution to handle all memcached keys with special characters, control characters, new lines, spaces, unicode characters, etc. is to base64 encode the key just before you pass it to the set() and get() methods of memcached.
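// Base64-encode the key's UTF-8 bytes (java.util.Base64, java.nio.charset.StandardCharsets)
String safeKey = Base64.getEncoder().encodeToString(key.getBytes(StandardCharsets.UTF_8));
// set: the second argument is the expiration in seconds (0 = never expire)
memcachedClient.set(safeKey, 0, value);
// get: encode the key the same way before looking it up
Object cached = memcachedClient.get(safeKey);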
This converts them into characters memcached is guaranteed to understand.
In addition, base64 encoding has a negligible performance cost (unless you are a nano performance optimization guy), is reliable, and adds only about 33% to the key length.
Works like a charm!
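The effect of the encoding can be illustrated with a short Go sketch. The helper names are invented, and hasInvalidKeyByte only mirrors the four-character check quoted from the Java client above. Since memcached keys are commonly limited to 250 bytes, the extra length is worth keeping in mind for long keys.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// hasInvalidKeyByte mirrors the client check quoted above:
// space, newline, carriage return, or NUL make a key invalid.
func hasInvalidKeyByte(key string) bool {
	return strings.ContainsAny(key, " \n\r\x00")
}

// safeKey (hypothetical helper) Base64-encodes the UTF-8 bytes of key.
func safeKey(key string) string {
	return base64.StdEncoding.EncodeToString([]byte(key))
}

func main() {
	key := "HK:00:A Kung Wan"
	fmt.Println(hasInvalidKeyByte(key))          // true
	fmt.Println(hasInvalidKeyByte(safeKey(key))) // false
	fmt.Println(len(key), len(safeKey(key)))     // 16 24
}
```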
