When should I escape urls? - ruby

I have a URL and escaped it using:
url = "http://ec4.images-xxx.com/images/I/41-%2B6wMiewL._SL135_.jpg"
url = URI.escape(url)
puts url # => "http://ec4.images-xxx.com/images/I/41-%252B6wMiewL._SL135_.jpg"
From the result I can see that URI escaped the previously escaped %2B again which became %252B, which is not correct.
How can I tell whether a URL still needs to be escaped? Or is there a smart method that knows when to escape and when not to?

Your first string is already properly URI encoded (%2B is the URI encoding for '+'), so when you try to re-encode it, the URI.escape method encodes the '%' itself as '%25', turning %2B into %252B.
If you're really not sure whether your string has been URI encoded or not, you could try to decode it first, and compare it with the original. If they're the same, then it hasn't been encoded.
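A minimal sketch of that decode-and-compare check, assuming Ruby's standard uri library. The helper name probably_encoded? is hypothetical, and the check is only a heuristic: a raw string that happens to contain a valid sequence such as %2B cannot be told apart from an encoded one.

```ruby
require "uri"

# Hypothetical helper: true when decoding changes the string,
# i.e. it contains at least one valid %XX escape sequence.
def probably_encoded?(str)
  # decode_www_form_component also turns "+" into a space, so protect
  # literal plus signs first to avoid false positives.
  URI.decode_www_form_component(str.gsub("+", "%2B")) != str
rescue ArgumentError
  # A stray "%" that is not followed by two hex digits is not valid encoding.
  false
end

url = "http://ec4.images-xxx.com/images/I/41-%2B6wMiewL._SL135_.jpg"
puts probably_encoded?(url)                           # true
puts probably_encoded?("http://example.com/a b.jpg")  # false
```

Only escape when the helper returns false; otherwise pass the URL through unchanged.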

Related

How to disable double escaping url in golang?

It looks like Go's reverse proxy double-escapes the URL when making an HTTP request.
server receives:
/id/EbnfwIoiiXbtr6Ec44sfedeEsjrf0RcXkJneYukTXa%252BIFVla4ZdfRiMzfh%252FEGs7f
expected:
/id/EbnfwIoiiXbtr6Ec44sfedeEsjrf0RcXkJneYukTXa%2BIFVla4ZdfRiMzfh%2FEGs7f
Is there a way to avoid double escaping?
Go isn't double-escaping the URL. It's constructing the URL given the values that you gave it (which means escaping the path). You've already escaped the path. Don't do that. If you want to send +, then use +. Go will escape it correctly.
If you have an escaped path already for some reason, then unescape it with url.PathUnescape() before constructing the URL.
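The same effect is easy to reproduce outside Go. Here is a small sketch in Ruby (the language of the first question), assuming the raw ID contains a literal + and /. ERB::Util.url_encode stands in for whatever escaping the HTTP client applies.

```ruby
require "erb"

# Assumed raw (unescaped) ID, reconstructed from the expected path above.
raw = "EbnfwIoiiXbtr6Ec44sfedeEsjrf0RcXkJneYukTXa+IFVla4ZdfRiMzfh/EGs7f"

once  = ERB::Util.url_encode(raw)   # "+" becomes %2B, "/" becomes %2F
twice = ERB::Util.url_encode(once)  # "%" becomes %25, so %2B becomes %252B

puts once
puts twice
```

Escaping the raw value once gives the expected path. Escaping the already-escaped value gives the %252B form the server received.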

Server.URLEncode started to replace blank with plus ("+") instead of percent-20 ("%20")

Given this piece of code:
<%
Response.Write Server.URLEncode("a doc file.asp")
%>
It output this for a while (like Javascript call encodeURI):
a%20doc%20file.asp
Now, for some unknown reason, I get:
a+doc+file%2Easp
I'm not sure what I touched to make this happen (maybe the file content encoding, ANSI/UTF-8). Why did this happen, and how can I get back the first behavior of Server.URLEncode, i.e. percent-encoding?
Classic ASP hasn't been updated in nearly 20 years, so Server.URLEncode still follows RFC 1866, which specifies that spaces be encoded as + symbols (a holdover from the old application/x-www-form-urlencoded media type). You are probably mistaken in thinking it ever encoded spaces as %20, unless there is an IIS setting I'm unaware of that changes this.
More modern languages use the RFC-3986 standard for encoding URLs, which is why Javascript's encodeURI function returns spaces encoded as %20.
Both + and %20 should decode to a space in query strings and form data thanks to RFC backwards compatibility, but it's generally considered best to use %20 when encoding spaces in a URL, as it's the more modern standard. Some decoding functions (such as Javascript's decodeURIComponent) don't recognise + as a space and leave it in place, so URLs that use + instead of %20 are not decoded properly.
You can always use a custom function to encode spaces as %20:
Function URL_encode(ByVal url)
    url = Server.URLEncode(url)
    url = Replace(url, "+", "%20")
    URL_encode = url
End Function
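For comparison, the same replace trick sketched in Ruby, with URI.encode_www_form_component playing the role of Server.URLEncode. The helper name url_encode_percent20 is made up for this example. The replacement is safe here because the encoder writes a literal plus sign as %2B, so every + left in its output is an encoded space. The VBScript version relies on Server.URLEncode behaving the same way, which is an assumption.

```ruby
require "uri"

# Hypothetical helper mirroring URL_encode above.
def url_encode_percent20(str)
  # encode_www_form_component writes spaces as "+" and literal plus
  # signs as %2B, so every remaining "+" is an encoded space.
  URI.encode_www_form_component(str).gsub("+", "%20")
end

puts url_encode_percent20("a doc file.asp")  # a%20doc%20file.asp
puts url_encode_percent20("1+1 = 2")         # 1%2B1%20%3D%202
```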

How can I send a parameter with spaces to a .NET Web API

I would like my Web API method to receive a long string that contains spaces.
To my understanding, I can't send a parameter with white space as-is. Does it have to be encoded in some way?
EDIT:
My content type is:
Content-Type: application/x-www-form-urlencoded
I've changed it to several other types but none of them allows me to receive a parameter with + instead of spaces
my post method signature is
public HttpResponseMessage EditCommentForExtension(string did, string extention, string comment)
Usually, parameters to an HTTP GET request are URL encoded. This means (among other things) that spaces are replaced by "+".
Using + to mean "space" is a convention of the application/x-www-form-urlencoded form encoding, not of generic URL percent-encoding. If you want + to mean a space anywhere else, you are going to have to convert it yourself.
As you discovered, spaces (like everything else that needs encoding) should be encoded as %XX, where each X stands for a hex digit.
http://www.w3.org/Addressing/rfc1738.txt
The only thing that worked for me was to put %20 in place of the spaces.
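When the body really is application/x-www-form-urlencoded, a form encoder handles the spaces and a matching form decoder restores them. Here is a client-side sketch in Ruby rather than .NET. The parameter values are made up, and how Web API binds the decoded values is not shown.

```ruby
require "uri"

# Hypothetical values for the three parameters in the method signature.
params = {
  "did" => "42",
  "extention" => "101",
  "comment" => "a long comment with spaces"
}

body = URI.encode_www_form(params)
puts body
# did=42&extention=101&comment=a+long+comment+with+spaces

decoded = URI.decode_www_form(body).to_h
puts decoded["comment"]  # a long comment with spaces
```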

Why doesn't URI.escape escape single quotes?

Why doesn't URI.escape escape single quotes?
URI.escape("foo'bar\" baz")
=> "foo'bar%22%20baz"
For the same reason it doesn't escape ? or / or :, and so forth. URI.escape() only escapes characters that cannot be used in URLs at all, not characters that have a special meaning.
What you're looking for is CGI.escape():
require "cgi"
CGI.escape("foo'bar\" baz")
=> "foo%27bar%22+baz"
This is an old question, but the answer hasn't been updated in a long time. I thought I'd update this for others who are having the same problem. The solution I found was posted here: use ERB::Util.url_encode if you have the erb module available. This took care of single quotes & * for me as well.
CGI::escape encodes spaces as plus signs rather than as %20, which is not what you want here.
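A quick side-by-side of the two encoders mentioned so far, using the string from the question with a * appended:

```ruby
require "cgi"
require "erb"

s = "foo'bar\" baz*"

puts CGI.escape(s)            # foo%27bar%22+baz%2A
puts ERB::Util.url_encode(s)  # foo%27bar%22%20baz%2A
```

Both escape the single quote and the asterisk. They differ only in how they write the space.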
According to the docs, URI.escape(str [, unsafe]) uses a regexp that matches all symbols that must be replaced with codes. By default the method uses REGEXP::UNSAFE. When this argument is a String, it represents a character set.
In your case, to modify URI.escape to escape even the single quotes you can do something like this ...
reserved_characters = /[^a-zA-Z0-9\-\.\_\~]/
URI.escape(YOUR_STRING, reserved_characters)
Explanation: Some info on the spec ...
All parameter names and values are escaped using the [RFC3986]
percent-encoding (%xx) mechanism. Characters not in the unreserved
character set ([RFC3986] section 2.3) MUST be encoded. Characters in
the unreserved character set MUST NOT be encoded. Hexadecimal
characters in encodings MUST be upper case. Text names and values MUST
be encoded as UTF-8 octets before percent-encoding them per [RFC3629].
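URI.escape was removed in Ruby 3.0, so on current Rubies the rule quoted above has to be applied some other way. A hand-rolled sketch, assuming a hypothetical helper named percent_encode; in practice ERB::Util.url_encode covers the same ground.

```ruby
# Matches every byte outside the RFC 3986 unreserved set (section 2.3).
NOT_UNRESERVED = /[^a-zA-Z0-9\-._~]/n

# Hypothetical helper: percent-encode each byte outside the unreserved set.
def percent_encode(str)
  # .b yields a binary copy, so multi-byte UTF-8 characters are
  # encoded one octet at a time; %02X gives upper-case hex.
  str.b.gsub(NOT_UNRESERVED) { |byte| format("%%%02X", byte.ord) }
end

puts percent_encode("foo'bar\" baz")  # foo%27bar%22%20baz
puts percent_encode("café")           # caf%C3%A9
```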
I know this has been answered, but what I wanted was something slightly different, and I thought I might as well post it up: I wanted to keep the "/" in the url, but escape all the other non-standard characters. I did it thus:
require "erb"
# public_filename is a *nix file path,
# like "/images/isn't/this a /horrible filepath/hello.png"
# Encode each path segment separately so the "/" separators survive.
public_filename.split("/").map { |s| ERB::Util.url_encode(s) }.join("/")
=> "/images/isn%27t/this%20a%20/horrible%20filepath/hello.png"
I needed to escape the single quote as I was writing a cache invalidation for AWS CloudFront, which didn't like the single quotes and expected them to be escaped. The above should produce a URI that is safer than the output of the standard URI.escape but still looks like a URI (CGI.escape breaks the URI format by escaping "/").

How to judge if a URL is already encoded with encodeURI?

I'm trying to do it in VBScript/JScript, to avoid re-encoding.
Should I check whether it contains "%"? Does "%" have other uses in a URL?
Thanks.
Edit: Oh, the original encoding function may not be encodeURI.
I'm trying to collect URLs from the browser, and store them after encoding with encodeURI.
But if the URL is already encoded, another encoding will make it wrong.
I might try decoding it and comparing the result to the original URL. If it changed or got shorter, your original URL was probably already encoded.
Iterate over the characters in the URL and test for characters that aren't allowed in a URL.
If there are any, encode it.
If there aren't any illegal characters, it doesn't matter.
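A sketch of that character test in Ruby (not VBScript/JScript), assuming the allowed set is the RFC 3986 unreserved and reserved characters plus well-formed %XX escapes. The names LEGAL_URL and needs_encoding? are made up for the example. Like the suggestion above, it cannot tell whether a legal-looking %XX was meant literally.

```ruby
# Characters RFC 3986 allows in a URI (unreserved + reserved),
# plus well-formed %XX escapes.
LEGAL_URL = %r{\A(?:[A-Za-z0-9\-._~:/?\#\[\]@!$&'()*+,;=]|%\h\h)*\z}

# Hypothetical helper: true when the URL contains a character that
# must be encoded, or a "%" that does not start a valid escape.
def needs_encoding?(url)
  !url.match?(LEGAL_URL)
end

puts needs_encoding?("http://example.com/a b")    # true
puts needs_encoding?("http://example.com/a%20b")  # false
puts needs_encoding?("http://example.com/100%")   # true
```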
