Web-serving a file : Firefox truncates name - firefox

I'm serving a file from Lighttpd whose name contains space characters. I'm using the mimetype "application/octet-stream".
When I download this in Chrome, it works perfectly. But when I download in Firefox, the filename is truncated at the first space.
Is this to do with the mimetype? With some other lightty config? Or maybe something to do with the kind of space-character I'm using?

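You need to URL-encode your links. Spaces in URLs are easily misinterpreted.
The percent-encoding for a space is %20. A + only stands for a space in query strings and form data, not in the path part of a URL.
So for http://example.com/test file.jpg
you would use:
http://example.com/test%20file.jpg
By contrast, http://example.com/test+file.jpg asks the server for a file literally named "test+file.jpg".
I prefer not to use filenames with spaces in them.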
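A minimal Python sketch of the encoding step, in case the links are generated by a script. The helper names are made up for illustration; `quote` and `quote_plus` come from the standard library. It shows percent-encoding for a path segment next to the form-style encoding used for query-string values.

```python
from urllib.parse import quote, quote_plus


def encode_path_segment(name: str) -> str:
    """Percent-encode one path segment; safe="" also escapes "/"."""
    return quote(name, safe="")


def encode_form_value(value: str) -> str:
    """Form-style encoding for query-string values; spaces become "+"."""
    return quote_plus(value)


print("http://example.com/" + encode_path_segment("test file.jpg"))
# http://example.com/test%20file.jpg
print("http://example.com/search?q=" + encode_form_value("test file.jpg"))
# http://example.com/search?q=test+file.jpg
```

The `/search?q=` URL is only a placeholder to show where the form-style variant belongs.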

Related

Server.URLEncode started to replace blank with plus ("+") instead of percent-20 ("%20")

Given this piece of code:
<%
Response.Write Server.URLEncode("a doc file.asp")
%>
It output this for a while (like JavaScript's encodeURI):
a%20doc%20file.asp
Now, for an unknown reason, I get:
a+doc+file%2Easp
I'm not sure what I touched to make this happen (maybe the file content encoding, ANSI/UTF-8). Why did this happen, and how can I get back the first behavior of Server.URLEncode, i.e. percent-encoding?
Classic ASP hasn't been updated in nearly 20 years, so Server.URLEncode still follows RFC 1866, which specifies that spaces be encoded as + (a holdover from the application/x-www-form-urlencoded media type). You must be mistaken in thinking it encoded spaces as %20 at some point, unless there's an IIS setting you can change that I'm unaware of.
More modern languages follow RFC 3986 when encoding URLs, which is why JavaScript's encodeURI function encodes spaces as %20.
Form decoders generally treat + and %20 the same way in a query string, but it's generally considered best to use %20 when encoding spaces in a URL. It is the more modern standard, it is the only form that means a space in the path part of a URL, and some decoding functions (such as JavaScript's decodeURIComponent) don't recognise + as a space and will leave it in place when decoding URLs that use it instead of %20.
You can always use a custom function to encode spaces as %20:
Function URL_encode(ByVal url)
    ' Server.URLEncode writes a literal "+" as %2B, so every "+" left over is an encoded space
    url = Server.URLEncode(url)
    url = Replace(url, "+", "%20")
    URL_encode = url
End Function
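The same idea can be sketched in Python to make the difference between the two conventions visible on the decoding side as well. `url_encode_percent` is a hypothetical name mirroring the function above; `quote_plus`, `unquote` and `unquote_plus` are standard-library calls.

```python
from urllib.parse import quote_plus, unquote, unquote_plus


def url_encode_percent(text: str) -> str:
    """Form-encode, then turn the "+" used for spaces into %20."""
    # quote_plus writes a literal "+" as %2B, so any "+" left is an encoded space
    return quote_plus(text).replace("+", "%20")


print(url_encode_percent("a doc file.asp"))  # a%20doc%20file.asp
print(unquote("a+doc+file%2Easp"))           # a+doc+file.asp  (RFC 3986 style: "+" stays)
print(unquote_plus("a+doc+file%2Easp"))      # a doc file.asp  (form style: "+" is a space)
```

A decoder that only understands percent-encoding leaves the + untouched, which is why %20 is the safer choice when the decoder is unknown.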

breakable slashes everywhere but URLs

I generate PDF (via LaTeX) from reStructuredText using Python Sphinx (1.4.6).
I use narrow table column headers with texts like "stuff/misc/other". I need the slashes to be breakable, so the table headers don't overflow into the next column.
The LaTeX solution is to use \BreakableSlash or \slash where necessary. I can use python code to replace all slashes:
from sphinx.util.texescape import tex_replacements
# \BreakableSlash needs package hyphenat to be loaded
tex_replacements.append((u'/', ur'\BreakableSlash ') )
# tex_replacements.append((u'/', ur'\slash ') )
But that will break any URL like http://www.example.com/ into something like
http:\unhbox\voidb@x\penalty\@M\hskip\z@skip/\discretionary{-}{}{}\penalty\@M\hskip\z@skip\unhbox\voidb@x\penalty\@M\hskip\z@skip/\discretionary{-}{}{}\penalty\@M\hskip\z@skipwww.example.com
or
http:/\penalty\exhyphenpenalty/\penalty\exhyphenpenaltywww.example.com
I'd like to use a general solution that works in both cases, where the editor of the documentation can still use normal ReST and doesn't have to worry about latex.
Any idea how to get classic slashes in URLs and breakable slashes everywhere else?
You have not really given data or source code and only asked for an idea, so I'll take the liberty of only sketching a solution in pseudocode:
1. Split the document into a list of strings at each space, using .split()
2. For each string, check whether it is a URL by comparing its start to http:// (and maybe also ftp://, https:// or similar schemes)
3. Do the replacements, but only in strings that are not URLs
4. Recombine all strings, including the spaces, using something like " ".join(my_list)
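The steps above can be sketched in plain Python as follows. This is only an illustration of the split/check/replace/join idea, not a Sphinx integration: the function name and the list of URL prefixes are assumptions, and it works on a plain string rather than on the document tree.

```python
# Prefixes that mark a word as a URL (an assumed, non-exhaustive list)
URL_PREFIXES = ("http://", "https://", "ftp://")


def make_slashes_breakable(text: str, replacement: str = "\\BreakableSlash ") -> str:
    """Replace "/" with a breakable slash in every word that is not a URL."""
    words = text.split(" ")  # splitting on a single space keeps runs of spaces intact
    result = []
    for word in words:
        if word.startswith(URL_PREFIXES):
            result.append(word)  # leave URLs untouched
        else:
            result.append(word.replace("/", replacement))
    return " ".join(result)


print(make_slashes_breakable("see stuff/misc/other at http://www.example.com/ today"))
# see stuff\BreakableSlash misc\BreakableSlash other at http://www.example.com/ today
```

A URL wrapped in punctuation, such as "(http://www.example.com/)", would not be recognised by this simple prefix test.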
One way to do it might be to write a Transform subclass and then register it with app.add_transform in setup(app), so that it is applied on every read.
I could use DefaultSubstitutions from Sphinx's transforms.py as a template for my own class.

How to reversibly escape a URL in Ruby so that it can be saved to the file system

The use-case example is saving the contents of http://example.com as a filename on your computer, but with the unsafe characters (i.e. : and /) escaped.
The classic way is to use a regex to strip all non-alphanumeric-dash-underscore characters out, but then that makes it impossible to reverse the filename into a URL. Is there a way, possibly a combination of CGI.escape and another filter, to sanitize the filename for both Windows and *nix? Even if the tradeoff is a much longer filename?
edit:
Example with CGI.escape
CGI.escape 'http://www.example.com/Hey/whatsup/1 2 3.html#hash'
#=> "http%3A%2F%2Fwww.example.com%2FHey%2Fwhatsup%2F1+2+3.html%23hash"
A couple things...are % signs completely safe as file characters? Unfortunately, CGI.escape doesn't convert spaces in a malformed URL to %20 on the first pass, so I suppose any translation method would require changing all spaces to + with a gsub and then applying CGI.escape
One way is to hash the URL and use the digest as the filename. For example, the URL for this question is: https://stackoverflow.com/questions/18234870/how-to-reversibly-escape-a-url-in-ruby-so-that-it-can-be-saved-to-the-file-syste. You could use the Ruby standard library's digest/md5 library to hash it. Simple and elegant.
require "digest/md5"
foldername = "https://stackoverflow.com/questions/18234870/how-to-reversibly-escape-a-url-in-ruby-so-that-it-can-be-saved-to-the-file-syste"
hashed_name = Digest::MD5.hexdigest(foldername) # => "5045cccd83a8d4d5c4fc01f7b4d8c502"
This works for the same reason MD5 is used to check the completeness of downloads: the MD5 digest of a given string is always the same hex string.
However, I wouldn't call this "reversible". You need a custom way to look up the URL for each hash that gets generated, maybe a .yml file with that data.
Update: As @the Tin Man suggests, a simple SQLite db would be much better than a .yml file when a large number of files need storing.
Here is how I would do it (adjust the regular expression as needed):
url = "http://stackoverflow.com/questions/18234870/how-to-reversibly-escape-a-url-in-ruby-so-that-it-can-be-saved-to-the-file-syste"
filename = url.each_char.map { |x|
  # keep letters, digits and "-"; write every other character as "_" plus its bytes in hex
  x.match(/[a-zA-Z0-9-]/) ? x : "_#{x.unpack('H*')[0]}"
}.join
EDIT:
If the length of the resulting filename is a concern, I would store the files in subdirectories named after the URL path segments.
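To show that this kind of escaping can be reversed, here is a rough Python sketch of the same scheme together with its inverse. The function names are made up, and it differs from the Ruby snippet in one assumed detail: it escapes byte by byte, so every escape is exactly "_" plus two hex digits, which keeps decoding unambiguous for non-ASCII characters too.

```python
import re
import string

# Bytes that are kept as-is: letters, digits and "-"
SAFE = set((string.ascii_letters + string.digits + "-").encode("ascii"))


def url_to_filename(url: str) -> str:
    """Escape every byte outside SAFE as "_" followed by two hex digits."""
    return "".join(chr(b) if b in SAFE else "_%02x" % b for b in url.encode("utf-8"))


def filename_to_url(name: str) -> str:
    """Invert url_to_filename by turning each "_hh" back into its byte."""
    raw = re.sub(
        rb"_([0-9a-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        name.encode("ascii"),
    )
    return raw.decode("utf-8")


name = url_to_filename("http://a.b/c d")
print(name)                   # http_3a_2f_2fa_2eb_2fc_20d
print(filename_to_url(name))  # http://a.b/c d
```

Because "_" itself is escaped (as _5f), an underscore in the filename always starts an escape, so the round trip is lossless.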

Why won't Firefox convert %20 to a space (' ')?

I am sending the browser a response that asks it to save a file under a given file name.
The file name might include spaces, so I replace all spaces with %20.
Internet Explorer and Chrome convert %20 back to spaces, but Firefox does not. Why?
Is there a way to make all browsers show the space?
This is my code:
String codedName = new String(URLEncoder.encode(name, "UTF-8"));
codedName = codedName.replaceAll("\\+", "%20");
response.setHeader("Content-Disposition", "attachment; filename=\"" + codedName+ "\"");
That depends on how you create the file name. Usually, you can simply set the file name in the header field and the framework will encode it properly. In your case, you seem to encode the name twice. Try without encoding it.
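As an illustration of sending the name unencoded, here is a hedged Python sketch of how such a header value could be built. It is not the asker's Java code, and the function name is hypothetical. It assumes the approach described in RFC 6266: a plain quoted filename with the spaces left as they are, plus an optional filename* parameter that carries a percent-encoded UTF-8 version for clients that support it.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build a Content-Disposition value with a quoted name and a filename* variant."""
    # ASCII fallback: non-ASCII characters become "?", and "\" and '"' are escaped
    fallback = filename.encode("ascii", "replace").decode("ascii")
    fallback = fallback.replace("\\", "\\\\").replace('"', '\\"')
    # Percent-encoded UTF-8 form for the extended parameter (assumed RFC 6266 usage)
    encoded = quote(filename, safe="")
    return 'attachment; filename="{}"; filename*=UTF-8\'\'{}'.format(fallback, encoded)


print(content_disposition("my report.pdf"))
# attachment; filename="my report.pdf"; filename*=UTF-8''my%20report.pdf
```

The point of the sketch is that the quoted filename parameter contains a real space, not %20, so a browser that shows that parameter verbatim still displays the intended name.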
You may use JavaScript to encode the URL.
The syntax for encoding URLs in JavaScript is:
encodeURI(uri)
So the code would be (note the space between my and test):
<script type="text/javascript">
var uri="my test.html?name=jason&age=25";
document.write(encodeURI(uri)+ "<br />");
</script>
Which results in:
my%20test.html?name=jason&age=25
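As per your recent comment "How do I do it in Java?"
The equivalent is java.net.URLEncoder, and the syntax would be something like:
URLEncoder.encode(String s, String enc)
Note that it produces form encoding, so spaces come out as + rather than %20.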
A simple Google search would reveal more information.

How to judge if a URL is already encoded with encodeURI?

I'm trying to do it in VBScript/JScript, to avoid re-encoding.
Should I check whether it contains "%"? Does "%" have other uses in a URL?
Thanks.
Edit: Oh, the original encoding function may not be encodeURI.
I'm trying to collect URLs from the browser, and store them after encoding with encodeURI.
But if the URL is already encoded, another encoding will make it wrong.
I might try decoding it and comparing the result to the original URL. If it changed or got shorter, your original URL was probably already encoded.
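A small Python sketch of this decode-and-compare heuristic (the question is about VBScript/JScript, so treat it purely as an illustration; `looks_encoded` is a made-up name):

```python
from urllib.parse import unquote


def looks_encoded(url: str) -> bool:
    """Guess whether url contains percent-escapes: decoding changes it if so."""
    return unquote(url) != url


print(looks_encoded("http://example.com/a%20b"))  # True
print(looks_encoded("http://example.com/a b"))    # False
print(looks_encoded("http://example.com/100%"))   # False: a lone "%" is not an escape
```

It remains a guess: an unencoded URL that happens to contain a literal sequence such as "%20" is indistinguishable from an encoded one.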
Iterate over the characters in the URL and test for characters that aren't allowed in a URL.
If there are any, encode the URL.
If there aren't any illegal characters, it doesn't matter.
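This check can be sketched in Python like so. The allowed set below is an assumption based on the unreserved and reserved characters of RFC 3986, with "%" added so that existing escapes are left alone; the function names are hypothetical.

```python
import string
from urllib.parse import quote

# Punctuation that may appear in a URL, plus "%" for existing escapes (assumed set)
URL_PUNCTUATION = "-._~:/?#[]@!$&'()*+,;=%"
ALLOWED = set(string.ascii_letters + string.digits + URL_PUNCTUATION)


def has_illegal_chars(url: str) -> bool:
    """Return True if any character is not allowed in a URL."""
    return any(ch not in ALLOWED for ch in url)


def encode_if_needed(url: str) -> str:
    """Escape only the illegal characters; an already clean URL is returned unchanged."""
    if has_illegal_chars(url):
        return quote(url, safe=URL_PUNCTUATION)
    return url


print(encode_if_needed("http://example.com/a b.html"))    # http://example.com/a%20b.html
print(encode_if_needed("http://example.com/a%20b.html"))  # http://example.com/a%20b.html
```

Applying `encode_if_needed` twice gives the same result as applying it once, which is the property needed to avoid double encoding.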
