Firefox fails to decompress gzip files

I have .gz files stored on AWS S3.
Using the S3 REST API, I'm generating authenticated links that point to individual files. I'm also setting response-header overrides so that browsers requesting these URLs will decompress the gzipped files and download them as attachments.
The generated S3 URL looks like this:
https://MY_BUCKET.s3.amazonaws.com/PATH_TO/file.ext.gz
?AWSAccessKeyId=MY_KEY
&Expires=DATE_TIME
&Signature=MY_SIGNATURE
&response-content-disposition=attachment%3B%20filename%3D%22file.ext%22
&response-content-encoding=gzip
&response-content-type=application%2Foctet-stream
&x-amz-security-token=MY_TOKEN
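(For context, a URL of this shape can be produced with, for example, boto3's generate_presigned_url; the sketch below is illustrative only, with placeholder bucket, key, and filename values.)
import boto3

s3 = boto3.client("s3")

# Placeholders throughout; the Response* parameters become the
# response-content-* query arguments seen in the URL above.
url = s3.generate_presigned_url(
    ClientMethod="get_object",
    Params={
        "Bucket": "MY_BUCKET",
        "Key": "PATH_TO/file.ext.gz",
        "ResponseContentDisposition": 'attachment; filename="file.ext"',
        "ResponseContentEncoding": "gzip",
        "ResponseContentType": "application/octet-stream",
    },
    ExpiresIn=3600,  # link lifetime in seconds
)
print(url)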
The links behave as expected in Chrome (42.0.2311), Safari (8.0.6), and Opera (29.0), all on OS X,
but NOT in Firefox (38.0.1).
Firefox downloads and renames the file correctly but fails to decompress the gzipped file.
The response headers of a GET request to an authenticated URL look like this:
Accept-Ranges:bytes
Content-Disposition:attachment; filename="file.ext"
Content-Encoding:gzip
Content-Length:928
Content-Type:application/octet-stream
Date:SOME_DATE_TIME
ETag:"MY_ETAG"
Last-Modified:SOME_OTHER_DATE_TIME
Server:AmazonS3
x-amz-expiration:expiry-date="ANOTHER_DATE_TIME"
x-amz-id-2:MY_AMZ_ID
x-amz-request-id:MY_AMZ_REQUEST_ID
x-amz-server-side-encryption:AES256
Does Firefox look for different headers and/or header values to indicate decompression?

The solution appears to be removing .gz from the end of the filename.
It's a common misconfiguration to set Content-Encoding: gzip on .gz files when you intend for the end user to download -- and end up with -- a .gz file; e.g., downloading a .tar.gz source package.
This isn't what you are doing... It's the opposite, essentially... but I suspect you're seeing a symptom of an attempt to address that issue.
In fact, the configuration I described should only be the case when you gzipped an already-gzipped file (which, of course, you shouldn't do)... but it was entrenched for a long time by (iirc) default Apache web server configurations. Old bug reports seem to suggest that the Firefox developers had a hard time grasping what should be done with Content-Encoding: gzip, particularly with regard to downloads. They were a bit obsessed, it seems, with the thought that the browser should not undo the content encoding when saving to disk, since saving to disk wasn't the same as "rendering" the downloaded content. That, to me, is nonsense, a too-literal interpretation of an RFC.
I suspect what you see is a legacy of that old issue.
Contrary to your conception, it's quite correct to store a file with Content-Encoding: gzip without a .gz extension... arguably, in fact, it's more correct to store such content without a .gz extension, because the .gz implies (at least to Firefox, apparently) that the downloading user should want the compressed content downloaded and saved in the compressed form.
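In other words, store the already-gzipped bytes under the final filename (no .gz suffix) and declare the compression via object metadata. A minimal sketch with boto3, assuming placeholder bucket and key names:
import gzip
import boto3

s3 = boto3.client("s3")

# Compress the payload yourself, but store it WITHOUT a .gz suffix and
# declare the compression via Content-Encoding metadata instead.
with open("file.ext", "rb") as f:
    body = gzip.compress(f.read())

s3.put_object(
    Bucket="MY_BUCKET",                      # placeholder
    Key="PATH_TO/file.ext",                  # note: no .gz suffix
    Body=body,
    ContentEncoding="gzip",
    ContentType="application/octet-stream",  # or the real type of file.ext
)
S3 should then return the stored Content-Encoding and Content-Type on GET, so the response-content-encoding override in the presigned URL is no longer needed.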

1. Background to compressed content
Michael's change of file extension solves the problem because the important step is to set the Content-Type header to reflect the underlying content within the compressed file, rather than the compressed file itself.
In many web servers, MIME types are detected based on file extension - for example, you may have a MIME type of application/gzip corresponding to the .gz file extension (on a default Debian install of nginx, this mapping can be found in /etc/nginx/mime.types). Your server will then send Content-Type: application/gzip for files matching this MIME type.
If your browser receives a Content-Type header suggesting that binary compressed content is on its way, rather than the text within the compressed file, it will assume it's not for human consumption and may not display it. Mine (Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:75.0) Gecko/20100101 Firefox/75.0) didn't.
2. Header adjustment
Set a Content-Encoding: gzip header.
Set a Content-Type: text/plain header for files you want displayed as plain text.
The browser (if gzip compression is supported) should decompress and display the content for the client.
3. Real world example
/usr/share/doc contains text documentation, many of which have also been gzip compressed.
By adding the following to the nginx server {} block, you can enable transparent decompression on the client:
# local documentation access
location /doc {
    alias /usr/share/doc;
    autoindex on;               # allow directory listings
    allow 127.0.0.1; deny all;  # anyone outside is forbidden

    # serve .gz content as text so the browser displays it
    location ~ \.gz {
        default_type text/plain;
        add_header Content-Encoding gzip;
    }
}
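A quick way to check the result from the client side is a small Python sketch using the requests library, which transparently decodes Content-Encoding: gzip. This assumes nginx is listening on port 80 of the local machine; the document path below is hypothetical, so pick any .gz file that actually exists under /usr/share/doc.
import requests

# Hypothetical example path; substitute a .gz file present on your machine.
r = requests.get("http://127.0.0.1/doc/nginx/changelog.Debian.gz")

print(r.headers.get("Content-Type"))      # expect: text/plain
print(r.headers.get("Content-Encoding"))  # expect: gzip
print(r.text[:200])                       # body arrives already decompressed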

Related

Golang fileserver setting content-type differently on linux and macos

I am using http.FileServer in my web service, and when I serve a JavaScript file from it, I get a Content-Type header of text/javascript; charset=utf-8 on Linux (Debian 11), but application/javascript on macOS 13.
The Go version is 1.19.1 on Linux and 1.19.3 on macOS. On both machines I set LANG=en_GB.UTF-8 in the environment the web service runs in.
Interestingly, when serving other text files, e.g. an HTML file, I get text/html; charset=utf-8 on both macOS and Linux.
What is the reason for this? It makes my unit tests fail on macOS, and I would prefer to test for the full Content-Type, including the character set.
http.FileServer uses the filename's extension to determine the Content-Type if it isn't otherwise set; that lookup calls mime.TypeByExtension().
The documentation for mime.TypeByExtension() says that the mapping is augmented by the system's MIME-info database. Those databases likely differ between Linux and macOS.
@Andrei Vasilev notes that you can override the default MIME types with mime.AddExtensionType().
Alternatively, you could update the appropriate local mime.types file so both systems return the same type. On my macOS 12.6.1 machine with go1.19.1 darwin/arm64, I have Apache installed, and the return value of:
mime.TypeByExtension(".js")
is from /etc/apache2/mime.types.

Windows Azure Blob Storage download with Shared Access Signature and CUSTOM response header

I use Windows Azure Blob Storage to keep files.
To download files, I create URLs with a Shared Access Signature.
It works fine, but there is one problem.
Some files (blobs) have the Content-Type header set during upload, and others do not.
If a file has no Content-Type, then the response from Azure will have the header Content-Type: application/octet-stream. This is exactly what I need, because in that case the browser shows a download dialog to the user.
But for files where this header was set on upload, it is returned, and that sometimes causes a problem. For example, Content-Type: image/jpeg makes a browser display the image rather than download it (no download dialog appears).
So, my question is:
is there a way, when downloading with a presigned URL from Windows Azure, to force a specific response header?
I want it to behave as if no Content-Type were saved for the file, even if one is.
After some browsing, I finally found the documentation about it.
Here are the references:
https://nxt.engineering/en/blog/sas_token/
https://learn.microsoft.com/en-us/rest/api/storageservices/service-sas-examples
https://learn.microsoft.com/en-us/rest/api/storageservices/create-service-sas
For me, it was necessary to bump the API version (I had been using the 2012 API version).
One more useful note: it is very sensitive to the date format. The expiration time must be in a format like "2021-11-16T04:25:00Z".
I have added two new arguments,
rscd=file;%20attachment&rsct=binary&
and both of them must appear in their correct places in the string to sign.
So, my question is: is there a way, when downloading with a presigned URL from Windows Azure, to force a specific response header? I want it to behave as if no Content-Type were saved for the file, even if one is.
Yes, you can override the Content-Disposition response header in your SAS token, and the blob will always be downloaded regardless of its content type.
Set this header to a value like attachment; filename=yourdesiredfilename and the blob will always be downloaded under that name.
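For illustration, here is a minimal sketch with the Python azure-storage-blob SDK, which exposes the rscd/rsct overrides as the content_disposition and content_type arguments of generate_blob_sas; the account, container, blob, and key values are placeholders.
from datetime import datetime, timedelta, timezone
from azure.storage.blob import BlobSasPermissions, generate_blob_sas

# All names and the key below are placeholders.
sas = generate_blob_sas(
    account_name="myaccount",
    container_name="mycontainer",
    blob_name="pictures/photo.jpg",
    account_key="MY_ACCOUNT_KEY",
    permission=BlobSasPermissions(read=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=1),
    content_disposition='attachment; filename="photo.jpg"',  # rscd override
    content_type="application/octet-stream",                 # rsct override
)

url = f"https://myaccount.blob.core.windows.net/mycontainer/pictures/photo.jpg?{sas}"
With this SAS appended, the blob should always come back with the overridden Content-Disposition and Content-Type, regardless of what was stored at upload time.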

Web API gzip compression

I have followed this URL for Web API compression, but when I inspect the output in Fiddler, the response is not compressed. There are multiple compression options available, for example GZIP, BZIP2, and DEFLATE; I am not sure which one to use. Kindly help.
I have tried the solution below, and it is not working:
http://benfoster.io/blog/aspnet-web-api-compression
there are multiple compression options available, for example GZIP, BZIP2, DEFLATE; not sure which one to use
That list is sent to the server (in the Accept-Encoding request header) to tell it the client's compression preferences. It means: "I prefer GZIP first. If GZIP is not supported on the server side, fall back to BZIP2 or DEFLATE. If none of them is supported, the server will not compress at all."
Someone has already created a NuGet package that uses the implementation you referenced in your question. The package is Microsoft.AspNet.WebApi.MessageHandlers.Compression, which installs the following two packages:
Microsoft.AspNet.WebApi.Extensions.Compression.Server
System.Net.Http.Extensions.Compression.Client
If you don't need the client-side library, then just install the server-side package in your Web API project.
To use it, add the following line at the end of your Application_Start method in Global.asax.cs:
GlobalConfiguration.Configuration.MessageHandlers.Insert(0, new ServerCompressionHandler(new GZipCompressor(), new DeflateCompressor()));
To learn more about this package, check this link.

What's the fastest way to upload an image to a webserver?

I am building an application that will allow users to upload images. It will mostly be used from mobile browsers over slow internet connections. I was wondering whether there are best practices for this. Is doing some encoding before the transfer and then decoding on the server a trick worth trying, or is there something else?
You would preferably want something with resumable uploads. Since your connections are slow, you'd need something that can resume where it left off. A module I've come across over the years is the nginx upload module:
http://www.grid.net.ru/nginx/upload.en.html
According to the site:
The module parses request body storing all files being uploaded to a directory specified by upload_store directive. The files are then being stripped from body and altered request is then passed to a location specified by upload_pass directive, thus allowing arbitrary handling of uploaded files. Each of file fields are being replaced by a set of fields specified by upload_set_form_field directive. The content of each uploaded file then could be read from a file specified by $upload_tmp_path variable or the file could be simply moved to ultimate destination. Removal of output files is controlled by directive upload_cleanup. If a request has a method other than POST, the module returns error 405 (Method not allowed). Requests with such methods could be processed in alternative location via error_page directive.

Curl is uncompressing a compressed file when I didn't ask it to

This is the opposite of the issue that all my searches kept turning up answers to, where people wanted plain text but got compressed data.
I'm writing a bash script that uses curl to fetch the mailing list archive files from a Mailman mailing list (using the standard Mailman web interface on the server end).
The file (for this month) is http://lists.example.com/private.cgi/listname-domain.com/2013-September.txt.gz (sanitized URL).
When I save this with my browser I get, in fact, a gzipped text file, which when ungzipped contains what I expect.
When I fetch it with curl (after previously sending the login password, getting a cookie set, and saving that cookie file to use in the request), though, what comes out on stdout (or is saved to a -o file) is the UNCOMPRESSED text.
How can I get curl to just save the data to a file the way my browser does? (Note that I am not using the --compressed flag in my curl call; this isn't a question of the server compressing data for transmission, it's a question of downloading a file that's stored compressed on the server's disk, which I want to keep compressed.)
(Obviously I can hack around this by re-compressing it in my bash script. That wastes CPU resources, though, and is a problem waiting to happen. Or I can leave it uncompressed, hack the name, and store it as just September.txt; that wastes disk space instead, and again would break if the behavior changed in the future. The problem seems to me to be that curl is getting confused between compressed transmission and actually compressed data.)
Is it possible the server is decompressing the file based on headers sent (or not sent) by curl? Try the following header with curl:
--header 'Accept-Encoding: gzip,deflate'
You can download the *.txt.gz directly, without any decompression, by using wget instead of curl.
wget http://lists.example.com/private.cgi/listname-domain.com/2013-September.txt.gz
If curl is essential, then check out the details here.
