Ruby Mechanize Zlib::BufError - ruby

Not sure why I'm getting this error now with the Mechanize gem - been using it for a while now with no issues.
My script will randomly stop and throw the following error:
/Users/username/.rvm/gems/ruby-1.9.3-p194/gems/mechanize-2.5.1/lib/mechanize/http/agent.rb:798:in `rescue in response_content_encoding': error handling content-encoding gzip: buffer error (Zlib::BufError) (Mechanize::Error)
Any ideas?

It's possible that you're hitting a URL that points to a load-balancer. One of the hosts behind that load-balancer is mis-configured, or perhaps it's configured differently than its peers, and is returning a gzipped version of the content, where others aren't. I've seen that problem in the past.
I've also see situations where the server said it was returning gzipped content, but sent it uncompressed. Or it could be sending zipped, not gzipped. The combinations are many.
The fix is to be sure your code is capable of sensing whether the returned content is compressed. Make sure you're sending the correct acceptable-content HTTP headers for your code to their server too. You have to program defensively and look at the actual content you get back, and then branch to do the right decompression, then pass that on for parsing.

I was able to get around this by setting the request headers like the following:
mechanize.request_headers = { "Accept-Encoding" => "" }

Related

Ruby progressbar with down gem

I am implementing a file downloader by using the down gem.
I need to add a progress bar to my program for fancy outputs. I found a gem called ruby-progressbar. However, I couldn't integrate it to my code base even though I followed the instructions documented on the official site. Here's what I have done so far:
First, I thought of using progress_proc. It was a bad idea because progress_proc returns chunked partial of the data.
Second, I streamed the data and built an idea on calculating chunked data. It worked well actually, but it smells bad to me.
Plus, here is the small part of my code base. I hope it helps you understand the concept.
progressbar = ProgressBar.create(title: 'File 1')
Down.download(url, progress_proc: ->(progress) { progressbar.progress = progress }) # It doesn't work
progressbar = ProgressBar.create(title: 'File 1')
file = Down.open(url, progress_proc: ->(progress) { progressbar.progress = progress })
chunked = 0
loop do
break if file.eof?
file.read(1024)
chunked += 1024
progressbar.progress = (chunked / file.size) * 100
end
# This worked well as I remember. It can be faulty because I wrote it down without testing.
In the HTTP protocol, there are two different ways on how a client can determine the full length of a response:
In the most common case, the entire response is sent by the server in one go. Here, the length of the response body in bytes is set in the Content-Length header of the response. Thus, if the response is not chunked, you can get the value of this header and read the response in one go as it is sent by the server.
The second option is for the server to send a chunked response. Here, the server sends chunks of the entire response, one after another. Each chunk is prefixed with the length of the chunk. However, the client has no way to know how many chunks there are in total, nor how large the total response may be. Often, this is even unknown to the server as the first chunks are already sent before the entire response is available to the server.
The down gem follows these two approaches by offering two interfaces:
In the first case (i.e. if the content length of the entire response is known), the gem will call the given content_length_proc once.
In the second case, as the entire length of the response is unknown before it was received in total, the down gem calls the progress_proc once for each chunk received. In this case, it is up to you to show something useful. In general, you can NOT show a progress bar as a percentage of completion here.

Is it appropriate to use HTTP status codes for non-HTTP errors?

I know someone who is writing an API, and wants to use HTTP status codes to report the outcome of queries. e.g. if the user calls example.com/api/product_info?product_id=X, and the product doesn't exist, it would return HTTP status 400: Bad Request. I think that, since this is a valid call (i.e. the actual HTTP request is not malformed), it should return a 200 code response, and just have the body of the response something like {status: 'error'; message: 'No such product'}.
So my question is,
1) Is it appropriate to use HTTP status codes to convey non-HTTP program state, as in the example above?
2) Is there some standard, or at least widely used, specification describing when HTTP status codes are appropriate for use?
I was actually just talking about this the other day - http://blogs.mulesoft.org/api-best-practices-response-handling/
Your status code should reflect the response of the API, as 200 is "OK" and should be used for data that is successfully returned. However, 201 should be used for created items.
As mentioned already, in the event where the user tries a call but it fails (ie: users/?id=5) the server could return back a 400 to inform the user that it was a Bad Request, or a 404 if the resource doesn't exist.
It also depends on the action - if they are searching for a user and there are no responses, I wouldn't return an error, just a 200 with no results found. However, if they are trying to do a PUT or PATCH on a user that doesn't exist I would tell them with an error- as chances are there's a problem within their application somewhere.
In the link posted above you'll find more status codes, but one of the biggest advantages to using status codes is that it informs the client just though the header what actually happened with the server. This allows them to do a relatively quick (and low memory) check instead of having to deserialize the body and loop through an array looking for an errors key.
Essentially, you're giving them the tools to quickly and easily understand what is happening- something that I think every (sane) developer appreciates.
Hope this helps!
- Mike

Need Help Understanding X11 Protocol Errors

I've just started building a minimal X server for Windows from scratch. As I work through it I'm sure I'll run into all kinds of errors and glitches as I work the bugs out and learn more about the protocol.
Here's an example of an error I have seen printed by a client:
X Error of failed request: 0
Major opcode of failed request: 0 ()
Serial number of failed request: 0
Current serial number in output stream: 3
The major opcode meaning seems pretty obvious, but where are the "X Error" codes defined?
What are the serial numbers of the failed request and output stream? Are these supposed to match each other? By output stream, does that mean what was sent to the xserver or what was sent to the xclient? Is this related to sequence numbers?
grep the source...
in libX11, XlibInt.c, _XPrintDefaultError() you can find this error message.
Most of what's printed is from the error event, which is presumably sent by your server.
The current serial is dpy->request which is in Xlibint.h:
unsigned long request; /* sequence number of last request. */
i.e. the last X request that was sent. This may or may not be the same as the request causing the error. (event->serial is supposed to be the request that caused the error, but your server may not have gotten this right)
To hope to code an X server I think you'll be digging into the source code a lot - the docs are not precise or thorough enough... really, you may as well use some of the existing code, the license is liberal enough.
Error codes are defined in the X Protocol specification chapter called Errors.
The other items in an error response are defined in the first chapter Protocol Formats. The actual values and layout of the error messages are found in the Errors section of the Protocol Encoding appendix.
From the contents of that message though it appears you're sending a response filled with zeros when the client isn't expecting a response - most requests to the X server should not get a response sent back through the protocol unless they failed.

Does an HTTP Status code of 0 have any meaning?

It appears that when you make an XMLHttpRequest from a script in a browser, if the browser is set to work offline or if the network cable is pulled out, the request completes with an error and with status = 0. 0 is not listed among permissible HTTP status codes.
What does a status code of 0 mean? Does it mean the same thing across all browsers, and for all HTTP client utilities? Is it part of the HTTP spec or is it part of some other protocol spec? It seems to mean that the HTTP request could not be made at all, perhaps because the server address could not be resolved.
What error message is appropriate to show the user? "Either you are not connected to the internet, or the website is encountering problems, or there might be a typing error in the address"?
I should add to this that I see the behavior in FireFox when set to "Work Offline", but not in Microsoft Internet Explorer when set to "Work Offline". In IE, the user gets a dialog giving the option to go online. FireFox does not notify the user before returning the error.
I am asking this in response to a request to "show a better error message". What Internet Explorer does is good. It tells the user what is causing the problem and gives them the option to fix it. In order to give an equivalent UX with FireFox I need to infer the cause of the problem and inform the user. So what in total can I infer from Status 0? Does it have a universal meaning or does it tell me nothing?
Short Answer
It's not a HTTP response code, but it is documented by WhatWG as a valid value for the status attribute of an XMLHttpRequest or a Fetch response.
Broadly speaking, it is a default value used when there is no real HTTP status code to report and/or an error occurred sending the request or receiving the response. Possible scenarios where this is the case include, but are not limited to:
The request hasn't yet been sent, or was aborted.
The browser is still waiting to receive the response status and headers.
The connection dropped during the request.
The request timed out.
The request encountered an infinite redirect loop.
The browser knows the response status, but you're not allowed to access it due to security restrictions related to the Same-origin Policy.
Long Answer
First, to reiterate: 0 is not a HTTP status code. There's a complete list of them in RFC 7231 Section 6.1, that doesn't include 0, and the intro to section 6 states clearly that
The status-code element is a three-digit integer code
which 0 is not.
However, 0 as a value of the .status attribute of an XMLHttpRequest object is documented, although it's a little tricky to track down all the relevant details. We begin at https://xhr.spec.whatwg.org/#the-status-attribute, documenting the .status attribute, which simply states:
The status attribute must return the response’s status.
That may sound vacuous and tautological, but in reality there is information here! Remember that this documentation is talking here about the .response attribute of an XMLHttpRequest, not a response, so this tells us that the definition of the status on an XHR object is deferred to the definition of a response's status in the Fetch spec.
But what response object? What if we haven't actually received a response yet? The inline link on the word "response" takes us to https://xhr.spec.whatwg.org/#response, which explains:
An XMLHttpRequest has an associated response. Unless stated otherwise it is a network error.
So the response whose status we're getting is by default a network error. And by searching for everywhere the phrase "set response to" is used in the XHR spec, we can see that it's set in five places:
To a network error, when:
the open() method is called, or
the response's body's stream is errored (see the algorithm described in the docs for the send() method)
the timed out flag is set, causing the request error steps to run
the abort() method is called, causing the request error steps to run
To the response produced by sending the request using Fetch, by way of either the Fetch process response task (if the XHR request is asychronous) or the Fetch process response end-of-body task (if the XHR request is synchronous).
Looking in the Fetch standard, we can see that:
A network error is a response whose status is always 0
so we can immediately tell that we'll see a status of 0 on an XHR object in any of the cases where the XHR spec says the response should be set to a network error. (Interestingly, this includes the case where the body's stream gets "errored", which the Fetch spec tells us can happen during parsing the body after having received the status - so in theory I suppose it is possible for an XHR object to have its status set to 200, then encounter an out-of-memory error or something while receiving the body and so change its status back to 0.)
We also note in the Fetch standard that a couple of other response types exist whose status is defined to be 0, whose existence relates to cross-origin requests and the same-origin policy:
An opaque filtered response is a filtered response whose ... status is 0...
An opaque-redirect filtered response is a filtered response whose ... status is 0...
(various other details about these two response types omitted).
But beyond these, there are also many cases where the Fetch algorithm (rather than the XHR spec, which we've already looked at) calls for the browser to return a network error! Indeed, the phrase "return a network error" appears 40 times in the Fetch standard. I will not try to list all 40 here, but I note that they include:
The case where the request's scheme is unrecognised (e.g. trying to send a request to madeupscheme://foobar.com)
The wonderfully vague instruction "When in doubt, return a network error." in the algorithms for handling ftp:// and file:// URLs
Infinite redirects: "If request’s redirect count is twenty, return a network error."
A bunch of CORS-related issues, such as "If httpRequest’s response tainting is not "cors" and the cross-origin resource policy check with request and response returns blocked, then return a network error."
Connection failures: "If connection is failure, return a network error."
In other words: whenever something goes wrong other than getting a real HTTP error status code like a 500 or 400 from the server, you end up with a status attribute of 0 on your XHR object or Fetch response object in the browser. The number of possible specific causes enumerated in spec is vast.
Finally: if you're interested in the history of the spec for some reason, note that this answer was completely rewritten in 2020, and that you may be interested in the previous revision of this answer, which parsed essentially the same conclusions out of the older (and much simpler) W3 spec for XHR, before these were replaced by the more modern and more complicated WhatWG specs this answers refers to.
status 0 appear when an ajax call was cancelled before getting the response by refreshing the page or requesting a URL that is unreachable.
this status is not documented but exist over ajax and makeRequest call's from gadget.io.
Know it's an old post. But these issues still exist.
Here are some of my findings on the subject, grossly explained.
"Status" 0 means one of 3 things, as per the XMLHttpRequest spec:
dns name resolution failed (that's for instance when network plug is pulled out)
server did not answer (a.k.a. unreachable or unresponding)
request was aborted because of a CORS issue (abortion is performed by the user-agent and follows a failing OPTIONS pre-flight).
If you want to go further, dive deep into the inners of XMLHttpRequest. I suggest reading the ready-state update sequence ([0,1,2,3,4] is the normal sequence, [0,1,4] corresponds to status 0, [0,1,2,4] means no content sent which may be an error or not). You may also want to attach listeners to the xhr (onreadystatechange, onabort, onerror, ontimeout) to figure out details.
From the spec (XHR Living spec):
const unsigned short UNSENT = 0;
const unsigned short OPENED = 1;
const unsigned short HEADERS_RECEIVED = 2;
const unsigned short LOADING = 3;
const unsigned short DONE = 4;
from documentation http://www.w3.org/TR/XMLHttpRequest/#the-status-attribute
means a request was cancelled before going anywhere
Since iOS 9, you need to add "App Transport Security Settings" to your info.plist file and allow "Allow Arbitrary Loads" before making request to non-secure HTTP web service. I had this issue in one of my app.
Yes, some how the ajax call aborted. The cause may be following.
Before completion of ajax request, user navigated to other page.
Ajax request have timeout.
Server is not able to return any response.

Ruby Large HTML emails getting error, limit to header size

def mailTo(subject,msg,folks)
begin
Net::SMTP.start('localhost', 25) do |smtp|
smtp.send_message "MIME-Version: 1.0\nContent-type: text/html\nSubject: #{subject}\n#{msg}\n#{DateTime.now}\n", 'person#domain.com', folks
end
rescue => e
puts "Emailing Sending Error - #{e}"
end
end
when the HTML is VERY large I get this exception
Emailing Sending Error - 552 5.6.0 Headers too large (32768 max)
how can i get a larger html above max to work with Net::SMTP in Ruby
This might not be a restriction imposed by the library, but rather a restriction imposed by the service you are using to send. It kinda depends on just how huge of an HTML file we're talking about here, but your mail server may simply not let you send things that large. This probably can not be addressed with simple programming; you're gonna have to come up with a creative solution, like sending through a different service or breaking up the message.
I believe that this is a problem with SMTP and sending that email/message. Try reducing the number of people you send a message to at one time. For example, if you are sending a message to 500 people at once, then maybe send the message to 50 different people at a time instead (sending the message ten times).
2 quick observations:
"552 5.6.0 Headers too large"
this is a SMTP error message. It's coming from your SMTP server, not your code. Your code is just bubbling it up.
Headers are supposed to be seperated by "\r\n", not just "\n". Try fixing that part of your code.
I ran into this issue today. I resolved it by adding body tags to the HTML email.
Without those everything was going into the header.
MIME-Version: 1.0
Content-type: text/html
Subject: Nifty Report
<body>
<h1>some junk</h1>
</body>

Resources