I tried to find a solution for this issue but nothing worked. When my REST API request URI is, for example, https://serverip/meeting/userlist/0
I always get the error "The URI you submitted has disallowed characters". I have even tried to leave this parameter in the config file blank:
$config['permitted_uri_chars'] = 'a-z 0-9~%.:_-+';
But I get the same error.
Is a 0 not allowed as the only content of the last URI segment? I need it to retrieve the user with id = 0.
Thanks a lot.
EDIT - SOLVED:
Hi Again,
finally I solved it. I found that a long time ago we commented out a check related to UTF-8 encoding in URI.php:
if ( ! empty($str) && ! empty($this->_permitted_uri_chars) && ! preg_match('/^['.$this->_permitted_uri_chars.']+$/i'.(UTF8_ENABLED ? 'u' : ''), $str))
We had left only the first condition. The code issues we had back then no longer seem to reproduce after reverting that change, and /0 now works fine.
So sorry, in the end it was a problem caused by our own modifications.
Thanks.
$config['permitted_uri_chars'] is used as a PCRE character class pattern.
When the dash is the last character in the class, it matches a literal dash. However, when a dash sits between two characters, it defines a range. So ... when you append the + (plus) sign after the dash, you get:
[_-+] // a range between underscore and plus in the ASCII table
You might be thinking "So what? Zeros are already allowed previously via 0-9", and you'd be correct, but that's not the problem. The problem is that the plus sign has a lower ASCII number than the underscore, and ranges don't work backwards, so _-+ is invalid and triggers a PCRE compilation failure, which in turn means the entire check fails and nothing is actually allowed.
You would see this if you had error_reporting enabled and/or looked at the error logs.
This doesn't happen if you only append the plus sign to the default pattern - the dash is not only the last character, but also escaped with a backslash - as you'd have this instead:
[_\-+] // Underscore, dash and plus sign as individual characters; not a range
I guess you thought the backslash was an actual character to be allowed and removed it. Just add it back:
$config['permitted_uri_chars'] = 'a-z 0-9~%.:_\-+';
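The range behaviour is easy to reproduce outside CodeIgniter. Below is a minimal sketch in Ruby, whose regex engine is not PCRE but rejects a backwards range in a character class in the same way. The helper names compiles? and segment_allowed? are made up for illustration and are not part of any framework.

```ruby
# Hypothetical helper: true if the character class compiles at all.
def compiles?(permitted)
  Regexp.new("^[#{permitted}]+$", Regexp::IGNORECASE)
  true
rescue RegexpError
  false
end

# Hypothetical helper with the same shape as the check quoted from URI.php.
def segment_allowed?(segment, permitted)
  Regexp.new("^[#{permitted}]+$", Regexp::IGNORECASE).match?(segment)
end

broken = 'a-z 0-9~%.:_-+'   # "_-+" is read as a backwards range
fixed  = 'a-z 0-9~%.:_\-+'  # escaped dash: "_", "-" and "+" are literals

puts compiles?(broken)             # the pattern itself is rejected
puts compiles?(fixed)
puts segment_allowed?('0', fixed)
```

With the broken class nothing can match because the pattern never compiles, which is why even a plain 0 is reported as disallowed.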
I have code which lists keys using ListObjectsPages->Contents->Key and copies those keys using CopyObject. This works in general but for some keys it's complaining NoSuchKey: The specified key does not exist. The set of keys it's complaining about include keys with +.
ListObjectsPages returns key "foo+bar".
CopyObject on "foo+bar" gives the NoSuchKey error.
CopyObject on "foo bar" (unescaped) gives the NoSuchKey error.
Oddly, if I use the CLI: aws s3 cp on "foo+bar", the copy works. But I can't use the CLI. I need to use the sdk.
I'm using v1.8.11
As Rayfen mentioned, the plus characters could be the result of space replacement.
Update:
Everything was hashed out here https://github.com/aws/aws-sdk-go/issues/1438. Rayfen was right about needing to QueryEscape. I'm going to award the only current answer with the bounty since it adds useful information, but not select it as correct.
The object key and metadata document is clear:
The following character sets are generally safe for use in key names:
Alphanumeric characters [0-9a-zA-Z]
Special characters !, -, _, ., *, ', (, and )
Not only would + be converted into a space; according to the section "Characters That Might Require Special Handling" of the same page, ':' should also be converted back from a space, which QueryUnescape does not do (it only converts + back to a space).
Check if your keys include other special characters to be handled with care, like : (also replaced by space), # or = (replaced by ;), or , and ?.
Check in particular if the key obtained from QueryUnescape has a + instead of a ':' in the original key: that could be a space incorrectly "unescaped".
This details what was happening: https://github.com/aws/aws-sdk-go/issues/1438
The main issue was that CopySource needs to be url-encoded and not the Key field, which was surprising to me. (I was url-encoding both.)
The other issue was that I was using path.Join, which strips out a trailing /. This is a problem because S3 keys can have a trailing /, which represents a sort of folder.
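To see why a literal + has to be escaped before it goes into a URL-encoded field, here is a small Ruby sketch that uses only the standard library's form encoding functions. It does not call any AWS SDK, and it assumes the receiving side decodes the value in the usual form-encoding way.

```ruby
require 'uri'

key = 'foo+bar'

# Decoding the raw key turns "+" into a space, so the service would
# look up "foo bar", which does not exist.
decoded_raw = URI.decode_www_form_component(key)

# Escaping first turns "+" into "%2B", which survives decoding intact.
escaped    = URI.encode_www_form_component(key)
round_trip = URI.decode_www_form_component(escaped)

puts decoded_raw
puts escaped
puts round_trip
```

Only the field that is treated as URL-encoded should be escaped this way. Escaping a field that is sent as-is would change the key itself.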
In using the Page Object gem, I'm trying to pull text from a page to verify error messages. One of these error messages contains double-quotes, but when the page object pulls the text from the page, it pulls some other characters.
expected ["Please select a category other than the Default â?oEMSâ?? before saving."]
to include "Please select a category other than the Default \"EMS\" before saving."
(RSpec::Expectations::ExpectationNotMetError)
I'm not quite sure how to escape these - I'm not sure where I could use Regexs and be able to escape these odd characters.
Honestly you are over complicating your validation.
I would recommend simplifying what you are trying to do, start by asking yourself: Is the part in quotes a critical part of your validation?
If it is, isolate it with a substring check such as text.include?("EMS"), where text is the string pulled from the page.
If it is not, then you are probably doing too much work; only check for exactly what you need in validation:
text.start_with?("Please select a category other than the Default")
With respect to the actual issue you are having, on a technical level you have an encoding issue. Encode your result string with utf-8 before you pass it to your validation and you will be fine.
Good luck
It's pretty likely that something along the line encoded the string improperly. (A tipoff is the accented characters followed by ?.) It seems pretty likely that the quotes were converted to "smart quotes" somewhere. This table compares Windows-1252 to UTF-8:
Code Point               Characters
Unicode  Windows-1252    Expected  Actual   UTF-8 Bytes
-------  ------------    --------  ------   -----------
U+201C   0x93            “         â€œ      %E2 %80 %9C
U+201D   0x94            ”         â€       %E2 %80 %9D
What you'll want to do is spot check various places in the code to find the first place the string is encoded in something other than UTF-8:
puts error_str.encoding
(For clarity, error_str is the variable that holds the string you are testing. I'm using puts, but you might want another way to log diagnostic messages.)
Once you find the string that's not encoded UTF-8, you can convert it:
error_str.encode('UTF-8')
Or, if the string is hardcoded somewhere, just replace the string.
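As a rough illustration of how this kind of damage arises, the sketch below takes the UTF-8 bytes of a left smart quote, mislabels them as Windows-1252, and converts them to UTF-8 again. The variable names are illustrative only. Reversing the two steps repairs the string, but only if this particular mix-up is really what happened in your case.

```ruby
quote = "\u201C" # left double quotation mark, UTF-8 bytes E2 80 9C

# Mislabel the UTF-8 bytes as Windows-1252, then transcode to UTF-8.
# Each of the three bytes becomes its own character.
mojibake = quote.dup.force_encoding('Windows-1252').encode('UTF-8')

# Undo it: turn the characters back into single bytes and relabel
# those bytes as UTF-8.
repaired = mojibake.encode('Windows-1252').force_encoding('UTF-8')

puts mojibake.encoding
puts mojibake.length
puts repaired == quote
```

Note that encode only helps when the string's encoding label is accurate. If the bytes are right but the label is wrong, force_encoding is the call that changes the label without touching the bytes.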
For more debugging advice, see: 3 Steps to Fix Encoding Problems in Ruby and How to Get From Theyâ€™re to They’re.
Every time I get an error when validating:
<iframe class="forecast" src="http://forecast.io/embed/#lat=-26.201560&lon=28.038995&name=Johannesburg,%20ZA&text-color=#ffffff&color=#ffffff&font=Helvetica&units=ca"></iframe>
Error (screenshot):
http://postimg.org/image/5h1kvzzuh/
I escaped the characters, but it didn't work.
Thanks.
W3C validator maintainer here. Short answer is, use instead the following:
<iframe class="forecast" src="http://forecast.io/embed/%23lat=-26.201560&lon=28.038995&name=Johannesburg,%20ZA&text-color=#ffffff&color=%23ffffff&font=Helvetica&units=ca"></iframe>
That is, the fix is just to replace # with %23 (the percent-encoding of the # character).
Explanation
The specific problem in that URL is the character references for # that it contains.
A character reference for # decodes to # (the “number-sign” or “hash” character), which is not a valid URL code point per the URL Standard, and so it’s not allowed in a URL.
The # character is only ever allowed in an absolute URL with fragment or relative URL with fragment—and then, explicitly allowed only after the part the URL spec defines as the actual URL.
And for the purposes of URLs, a character reference for # and a literal # are exactly the same.
Hence, you must use it as %23 (that is, percent-encoded).
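If the URL is generated by code, the encoding can be done there instead of by hand. A minimal Ruby sketch follows; the example.com URL and the color parameter name are placeholders for illustration, not the real embed URL.

```ruby
require 'uri'

color = '#ffffff'

# Percent-encode the value before placing it in the URL; "#" becomes "%23".
encoded = URI.encode_www_form_component(color)

# Hypothetical URL used only to show where the encoded value goes.
url = "https://example.com/embed?color=#{encoded}"

puts encoded
puts url
```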
P.S. I plan to get the URL checker in the validator updated to actually report the particular illegal characters it finds in URLs but it will be a while yet before I can get that refinement made.
I am sending tokens via a POST request, but when I see them on the server it doesn't match up with what was sent.
"U2FsdGVkX1+pxBHFdSU4NiSIOdR2GCCBr/WF7AOSF5zQjRqjSoTeOKR0Dzwm\nNT+g\n" <-- Original
"U2FsdGVkX1+pxBHFdSU4NiSIOdR2GCCBr/WF7AOSF5zQjRqjSoTeOKR0Dzwm\\nNT+g\\n" <-- Result
Notice that the \n has been replaced with \\n. When I do the token lookup verification, of course, no result is found because the string I'm looking for is not the proper string anymore!
I'm not sure why this string is being auto changed like this or quite how to correct it. I'm just accessing this through the standard params like so.
token.verify(params["token"])
EDIT for further clarity
I'm viewing this from the terminal using the debugger gem. I have autoeval enabled and display with params["token"] without p or puts. I am not trying to create newline characters with \n. The literal \n is an actual part of the string that is received in the post. I randomly generate a token using a hashing and encryption library and the strings sometimes end up with these characters in them. If I run token.verify(params["token"]) from the debugger terminal I get nil back from the database as there is no match due to the extra backslash characters being added into the string.
If I directly run token.verify("U2FsdGVkX1+pxBHFdSU4NiSIOdR2GCCBr/WF7AOSF5zQjRqjSoTeOKR0Dzwm\nNT+g\n") from the debugger terminal I get the correct record back from the database. This leaves me thinking that either Rack or Sinatra is auto escaping the "special" characters in the string before I get a chance to even touch it.
This has something to do with the way Ruby handles special characters. From irb you can see this with a quick check like this.
"\\n" == '\n'
Perhaps unexpectedly, at least to me, this returns true, as the two are treated the same. Rather than trying to deal with special characters coming across the wire, I ended up just Base64-encoding everything.
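A slightly longer check makes the difference visible. In the sketch below (variable names are illustrative), the single-quoted and the escaped double-quoted literals are both a backslash followed by n, while "\n" is one newline character. Tools that display strings with inspect show the two-character version as "\\n", which may be what the debugger output above is showing. The last line is only a possible fix, on the assumption that the stored token contains real newlines.

```ruby
literal = '\n'   # single quotes: backslash + n, two characters
escaped = "\\n"  # double quotes with an escaped backslash: same two characters
newline = "\n"   # double quotes: one newline character

puts literal == escaped   # same content
puts literal.length
puts newline.length
puts literal.inspect      # inspect doubles the backslash for display

# Possible fix under the stated assumption: turn the two-character
# sequence back into a real newline before the lookup.
restored = literal.gsub('\n', "\n")
puts restored == newline
```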
I've written a more detailed post about this on my blog at:
http://idisposable.co.uk/2010/07/chrome-are-you-sanitising-my-inputs-without-my-permission/
but basically, I have a string which is:
||abcdefg
hijklmn
opqrstu
vwxyz
||
I've added the pipes to indicate where the string starts and ends; in particular, note the final carriage return on the last line.
I need to put this into a hidden form variable to post off to a supplier.
In basically any browser except Chrome, I get the following:
<input type="hidden" id="pareqMsg" value="abcdefg
hijklmn
opqrstu
vwxyz
" />
but in Chrome, it seems to apply a .Trim() or something else that gives me:
<input type="hidden" id="pareqMsg" value="abcdefg
hijklmn
opqrstu
vwxyz" />
Notice it's cut off the last carriage return. These carriage returns (when Encoded) come up as %0A if that helps.
Basically, in any browser except Chrome, the whole thing just works and I get the desired response from the third party. In Chrome, I get an 'invalid pareq' message (which suggests to me that those last carriage returns are important to the supplier).
Chrome version is 5.0.375.99
Am I going mad, or is this a bug?
Cheers,
Terry
You can't rely on form submission to preserve the exact character data you include in the value of a hidden field. I've had issues in the past with Firefox converting CRLF (\r\n) sequences into bare LFs, and your experience shows that Chrome's behaviour is similarly confusing.
And it turns out, it's not really a bug.
Remember that what you're supplying here is an HTML attribute value - strictly, the HTML 4 DTD defines the value attribute of the <input> element as of type CDATA. The HTML spec has this to say about CDATA attribute values:
User agents should interpret attribute values as follows:
Replace character entities with characters,
Ignore line feeds,
Replace each carriage return or tab with a single space.
User agents may ignore leading and trailing white space in CDATA attribute values (e.g., " myval " may be interpreted as "myval"). Authors should not declare attribute values with leading or trailing white space.
So whitespace within the attribute value is subject to a number of user agent transformations - conforming browsers should apparently be discarding all your linefeeds, not only the trailing one - so Chrome's behaviour is indeed buggy, but in the opposite direction to the one you want.
However, note that the browser is also expected to replace character entities with characters - which suggests you ought to be able to encode your CRs and LFs as &#13; and &#10;, and even spaces as &#32;, eliminating any actual whitespace characters from your value field altogether.
However, browser compliance with these SGML parsing rules is, as you've found, patchy, so your mileage may certainly vary.
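If the form is generated server-side, the encoding could be applied when the value is written out. Here is a minimal Ruby sketch; encode_whitespace is a hypothetical helper, and it only handles whitespace, so other characters that are special in attributes (such as & and ") would still need normal HTML escaping.

```ruby
# Hypothetical helper: replace CR, LF, tab and space with numeric
# character references so no raw whitespace remains in the attribute.
def encode_whitespace(value)
  value.gsub(/[\r\n\t ]/) { |ch| format('&#%d;', ch.ord) }
end

pareq = "abcdefg\r\nvwxyz\r\n"
attribute_value = encode_whitespace(pareq)

puts attribute_value
puts %(<input type="hidden" id="pareqMsg" value="#{attribute_value}" />)
```

Whether a given browser then hands the decoded characters back unchanged on submit is exactly the part that needs testing.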
Confirmed it here. It trims trailing CRLFs, they don't get parsed into the browser's DOM (I assume for all HTML attributes).
If you append CRLF with script, e.g.
var pareqMsg = document.forms[0]['pareqMsg'];
// Append CRLF only if the value does not already end with one.
if (!/\r\n$/.test(pareqMsg.value)) {
    pareqMsg.value += '\r\n';
}
...they do get maintained and POSTed back to the server. Although the hidden <textarea> idea suggested by Gaby might be easier!
Normally you cannot type a newline into an input box from the keyboard, so perhaps Chrome enforces this even for values supplied through the attribute.
Try using a textarea (with display:none).