What's an appropriate HTTP status code for a REST API service to return on a validation failure?

I'm currently returning 401 Unauthorized whenever I encounter a validation failure in my Django/Piston based REST API application.
Having had a look at the HTTP Status Code Registry, I'm not convinced that this is an appropriate code for a validation failure. What do y'all recommend?
400 Bad Request
401 Unauthorized
403 Forbidden
405 Method Not Allowed
406 Not Acceptable
412 Precondition Failed
417 Expectation Failed
422 Unprocessable Entity
424 Failed Dependency
Update: "Validation failure" above means an application level data validation failure, i.e., incorrectly specified datetime, bogus email address etc.

If "validation failure" means that there is some client error in the request, then use HTTP 400 (Bad Request). For instance if the URI is supposed to have an ISO-8601 date and you find that it's in the wrong format or refers to February 31st, then you would return an HTTP 400. Ditto if you expect well-formed XML in an entity body and it fails to parse.
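The date case above can be sketched in a few lines. This is only an illustration, assuming the date arrives as a single URI segment; the function name `status_for_date_segment` is hypothetical and not part of Django or Piston.

```python
from datetime import date


def status_for_date_segment(segment: str) -> int:
    """Return 400 when a URI date segment is not a valid ISO-8601 calendar date."""
    try:
        date.fromisoformat(segment)
    except ValueError:
        # Raised both for a wrong format ("31/02/2016")
        # and for an impossible date ("2016-02-31").
        return 400
    return 200
```

A real view would return a full response with an explanatory body rather than a bare status integer.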
(1/2016): Over the last five years WebDAV's more specific HTTP 422 (Unprocessable Entity) has become a very reasonable alternative to HTTP 400. See for instance its use in JSON API. But do note that HTTP 422 has not made it into HTTP 1.1, RFC-7231.
Richardson and Ruby's RESTful Web Services contains a very helpful appendix on when to use the various HTTP response codes. They say:
400 (“Bad Request”)
Importance: High.
This is the generic client-side error status, used when no other 4xx error code is appropriate. It’s commonly used when the client submits a representation along with a
PUT or POST request, and the representation is in the right format, but it doesn’t make
any sense. (p. 381)
and:
401 (“Unauthorized”)
Importance: High.
The client tried to operate on a protected resource without providing the proper authentication credentials. It may have provided the wrong credentials, or none at all.
The credentials may be a username and password, an API key, or an authentication
token—whatever the service in question is expecting. It’s common for a client to make
a request for a URI and accept a 401 just so it knows what kind of credentials to send
and in what format. [...]

From RFC 4918 (and also documented at http://www.iana.org/assignments/http-status-codes/http-status-codes.xhtml):
The 422 (Unprocessable Entity) status code means the server
understands the content type of the request entity (hence a
415 (Unsupported Media Type) status code is inappropriate), and the
syntax of the request entity is correct (thus a 400 (Bad Request)
status code is inappropriate) but was unable to process the contained
instructions. For example, this error condition may occur if an XML
request body contains well-formed (i.e., syntactically correct), but
semantically erroneous, XML instructions.
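One way to read the distinction drawn in that definition is as a three-step check: media type, then syntax, then semantics. The sketch below assumes a JSON body with an `email` field; the function name and the deliberately naive email rule are illustrative assumptions, not anything prescribed by the RFC.

```python
import json


def classify_request(content_type: str, body: str) -> int:
    """Map a request to 415, 400, 422 or 200 following the RFC 4918 wording."""
    if content_type != "application/json":
        return 415  # the server does not understand the content type
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400  # the syntax of the entity is wrong
    email = payload.get("email", "") if isinstance(payload, dict) else ""
    if not isinstance(email, str) or "@" not in email:
        return 422  # well-formed, but semantically erroneous
    return 200
```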

A duplicate in the database should be a 409 CONFLICT.
I recommend using 422 UNPROCESSABLE ENTITY for validation errors.
I give a longer explanation of 4xx codes here: http://parker0phil.com/2014/10/16/REST_http_4xx_status_codes_syntax_and_sematics/

Here it is:
rfc2616#section-10.4.1 - 400 Bad Request
The request could not be understood by the server due to malformed
syntax. The client SHOULD NOT repeat the request without modifications.
rfc7231#section-6.5.1 - 6.5.1. 400 Bad Request
The 400 (Bad Request) status code indicates that the server cannot
or will not process the request due to something that is perceived
to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing).
This refers to malformed (not well-formed) cases!
rfc4918 - 11.2. 422 Unprocessable Entity
The 422 (Unprocessable Entity) status code means the server
understands the content type of the request entity (hence a 415 (Unsupported Media Type) status code is inappropriate), and the syntax of the request entity is correct (thus a 400 (Bad Request) status code is inappropriate) but was unable to process the contained instructions. For example, this error condition may occur if an XML request body contains well-formed (i.e., syntactically correct), but semantically erroneous, XML instructions.
Conclusion
Rule of thumb: [_]00 covers the most general case and cases that are not covered by a designated code.
422 is the best fit for an object validation error (this is precisely my recommendation).
As for "semantically erroneous", think of something like a "This username already exists" validation.
400 is incorrectly used for object validation.

I would say technically it might not be an HTTP failure, since the resource was (presumably) validly specified, the user was authenticated, and there was no operational failure (however even the spec does include some reserved codes like 402 Payment Required which aren't strictly speaking HTTP-related either, though it might be advisable to have that at the protocol level so that any device can recognize the condition).
If that's actually the case, I would add a status field to the response with application errors, like
<status><code>4</code><message>Date range is invalid</message></status>
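A status element like the one above could be built with the standard library, for instance. This is a minimal sketch; the helper name `status_element` and the meaning of code 4 are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def status_element(code: int, message: str) -> str:
    """Serialize an application-level status as a small XML fragment."""
    status = ET.Element("status")
    ET.SubElement(status, "code").text = str(code)
    ET.SubElement(status, "message").text = message
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(status, encoding="unicode")
```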

There's a little bit more information about the semantics of these errors in RFC 2616, which documents HTTP 1.1.
Personally, I would probably use 400 Bad Request, but this is just my personal opinion without any factual support.

Here's another interesting scenario to discuss.
What if it's a type detection API that, for instance, accepts as input a reference to some locally stored parquet file and, after reading through some metadata of the blocks that compose the file, realizes that one or more of the block sizes exceed a configured threshold, so the server decides the file is not partitioned correctly and refuses to start the type detection process?
This validation is there to protect against one (or both) of two scenarios: (1) long processing time and a bad user experience; (2) the server application exploding with an OutOfMemoryError.
What would be an appropriate response in this case?
400 (Bad Request) ? - sort of works, generically.
401 (Unauthorized i.e. Unauthenticated) ? - unrelated.
403 (Forbidden i.e. Unauthorized) ? - some would argue it may be somewhat appropriate in this case.
422 (Unprocessable entity) ? - many older answers mention this as an appropriate option for input validation failure. What bothers me about using it in my case is that the definition of this response code says it's "due to semantic error", while I couldn't quite understand what semantic error means in that context and whether we can indeed consider this failure a semantic error.
Also, the allegedly simple concept of "input" as part of "input validation" can be confusing in cases like this, where the physical input provided by the client is only a pointer, a reference to some entity stored on the server, and the actual validation is done on data stored on the server (the parquet file metadata) in conjunction with the action the client tries to trigger (type detection).
413 (PayloadTooLarge)? Going through the different codes, I encountered one that no one has mentioned here so far and that may be suitable in my case: 413 PayloadTooLarge. I wonder whether it is suitable or not, since it's not the actual payload sent in the request that is too large, but the payload of the resource stored on the server.
Which leads me to think that maybe a 5xx response is more appropriate here.
507 Insufficient Storage ? If we say that "storage" is like "memory", and that we're failing fast here with a claim that we don't have enough memory to process this job (or may run out of memory trying), then maybe 507 can be appropriate. But not really.
My conclusion is that in this type of scenario, where the server refuses to invoke an action on a resource due to space-time related constraints, the most suitable response would be 413 PayloadTooLarge.
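The fail-fast check described in this answer might look roughly like the sketch below. It assumes the block sizes have already been read from the file metadata; `precheck_status`, its parameters, and the choice of 200 for the passing case are hypothetical, and 413 simply follows this answer's conclusion rather than any settled convention.

```python
from typing import List, Tuple


def precheck_status(block_sizes: List[int], max_block_bytes: int) -> Tuple[int, List[int]]:
    """Refuse the job up front if any block exceeds the configured threshold."""
    oversized = [size for size in block_sizes if size > max_block_bytes]
    if oversized:
        # Report the offending sizes so the client can see why it was refused.
        return 413, oversized
    return 200, []
```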

What exactly do you mean by "validation failure"? What are you validating? Are you referring to something like a syntax error (e.g. malformed XML)?
If that's the case, I'd say 400 Bad Request is probably the right thing, but without knowing what it is you're "validating", it's impossible to say.

If you are validating data and the data does not conform to the defined rules, it's better to send 422 (Unprocessable Entity) so that the sender will understand that they are breaking the agreed-upon rules.
Bad Request is for syntax errors. Some tools, like Postman, show syntax errors up front.

Related

HTTP status code for creating too many resources

If there is a limit on the number of resources created using POST request, what should be the status code?
Let's say, there is a restriction on the number of resources created using POST wherein only 10 resources can be created. The 11th POST request should fail due to the above constraint. What should be the status code?
Should it be 422 with a meaningful message, something along the lines of "Resource count limit reached"? or is there a status code for this?
It really depends on your use-case.
If the user is limited in time (let's say 10 per day) but might automatically get more credits later, I suggest 429 Too Many Requests, as the client sent too many requests in one day.
If credits are locked (i.e. the user only had 10 free credits), I suggest 403 Forbidden, as the request is fully understood and processable but the server denies it due to lack of credits.
Anyway 422 Unprocessable entity is not correct as the request is well formed and server might process it with given credits. Nothing is really missing from the request (from what I understand from your post).
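The two cases in this answer reduce to a small decision function. The sketch below is illustrative only: `limit_status` and the `quota_resets` flag are hypothetical names, and 201 is assumed as the success status for a POST that creates a resource.

```python
def limit_status(created: int, limit: int, quota_resets: bool) -> int:
    """Pick a status for a POST that may exceed a resource-count limit."""
    if created < limit:
        return 201  # still under the limit, so the resource is created
    # Over the limit: 429 if the quota renews over time, 403 if it never does.
    return 429 if quota_resets else 403
```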
I think that HTTP 400 is appropriate, especially if you can provide helpful feedback in the error response. If a user is submitting an invalid payload in the request, it's a bad request. Anything else might get confusing.
Though, HTTP 405 (Method Not Allowed) might be better. If no more POST requests are accepted by the server for a particular resource, that may be more accurate. However, it really just depends on the future use of the API.

HTTP GET vs POST for Idempotent Reporting

I'm building a web-based reporting tool that queries but does not change large amounts of data.
In order to verify the reporting query, I am using a form for input validation.
I know the following about HTTP GET:
It should be used for idempotent requests
Repeated requests may be cached by the browser
What about the following situations?
The data being reported changes every minute and must not be cached?
The query string is very large and greater than the 2000 character URL limit?
I know I can easily just use POST and "break the rules", but are there definitive situations in which POST is recommended for idempotent requests?
Also, I'm submitting the form via AJAX and the framework is Python/Django, but I don't think that should change anything.
I think that using POST for this sort of situation is acceptable. Citing the HTTP 1.1 RFC:
The action performed by the POST method might not result in a
resource that can be identified by a URI. In this case, either 200
(OK) or 204 (No Content) is the appropriate response status,
depending on whether or not the response includes an entity that
describes the result.
In your case a "search result" resource is created on the server, which is consistent with the HTTP POST specification. You can either return the result resource in the response, or return a separate URI to the newly created resource, which can be deleted once it is no longer needed after one minute (since, as you said, the data changes every minute).
The data being reported changes every minute
Every time you make a request, it is going to create a new resource based on your above statement.
Additionally, you can return a 201 status and a URL to retrieve the search result resource. I'm not sure if you want this sort of behavior; I mention it only as a side note.
The second part of your first question says results must not be cached. This is something you configure on the server: return the necessary HTTP headers to force intermediary proxies and clients not to cache the result, for example Cache-Control, Expires etc.
Your second question is already answered: you have to use a POST request instead of a GET request due to the URL character limit.
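As a rough illustration of the caching point, a response could carry a `Cache-Control: no-store` header, which asks clients and intermediaries not to keep a copy. The helper `no_cache_response` and the tuple it returns are hypothetical; in Django you would set the same header on the response object instead.

```python
from typing import Dict, Tuple


def no_cache_response(body: str) -> Tuple[int, Dict[str, str], str]:
    """Build a (status, headers, body) triple that must not be cached."""
    headers = {
        # "no-store" tells caches not to store any part of the response.
        "Cache-Control": "no-store",
    }
    return 200, headers, body
```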

What HTTP status code should I use for nickname validation?

I am building an API and one of the endpoints is about nickname validation for a company. I read a lot about HTTP status codes and for entity validation 422 seems the best choice. How about one field validation as in my example?
What HTTP status code should I use for nickname validation?
For example it already exists
I think 409 Conflict is an appropriate pick
The 409 (Conflict) status code indicates that the request could not
be completed due to a conflict with the current state of the target
resource. This code is used in situations where the user might be
able to resolve the conflict and resubmit the request. The server
SHOULD generate a payload that includes enough information for a user
to recognize the source of the conflict.
User 1 picked a username, User 2 wants the same, but can't, because it conflicts with User 1's username
or has not allowed chars
For this, 422 Unprocessable Entity as you mentioned seems ok.
The 422 (Unprocessable Entity) status code means the server understands the content type of the request entity (hence a 415(Unsupported Media Type) status code is inappropriate), and the syntax of the request entity is correct (thus a 400 (Bad Request) status code is inappropriate) but was unable to process the contained instructions. For example, this error condition may occur if request body contains well-formed (i.e., syntactically correct), but semantically erroneous, instructions.
Emphasis mine
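Putting both cases of this answer together, a nickname check might be sketched as follows. The allowed-character pattern, the case-insensitive comparison, and the name `nickname_status` are assumptions made for illustration only.

```python
import re
from typing import Set

# Hypothetical rule: 3 to 20 letters, digits or underscores.
NICKNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def nickname_status(nickname: str, taken: Set[str]) -> int:
    """Return 422 for disallowed characters, 409 for a taken name, else 200."""
    if not NICKNAME_PATTERN.fullmatch(nickname):
        return 422  # well-formed request, but the value breaks the rules
    if nickname.lower() in taken:
        return 409  # conflicts with the current state (an existing nickname)
    return 200
```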

Ajax call best practice: logical error through HTTP error

Is there any reason to avoid sending a logical error as an HTTP 500 error?
I'm developing a web app; after an Ajax call, my default error handler shows a modal notice with the text of the error.
In some cases the error can be logical ('ID not found' or 'The input is not a number' ...). In these cases, can I send an HTTP error, or should that be reserved only for transport/authentication errors?
This is certainly possible from a technical point of view, but I would advise against it. The reason is that you would conflate a logical error in the request with a processing error on the server. Those should be kept separate.
Instead I suggest you use http code 406 Not Acceptable. It signals that the request has been received, but will not be processed because "it does not make sense". I'd say that is more suitable.

Is it wrong to return 202 "Accepted" in response to HTTP GET?

I have a set of resources whose representations are lazily created. The computation to construct these representations can take anywhere from a few milliseconds to a few hours, depending on server load, the specific resource, and the phase of the moon.
The first GET request received for the resource starts the computation on the server. If the computation completes within a few seconds, the computed representation is returned. Otherwise, a 202 "Accepted" status code is returned, and the client must poll the resource until the final representation is available.
The reason for this behavior is the following: If a result is available within a few seconds, it needs to be retrieved as soon as possible; otherwise, when it becomes available is not important.
Due to limited memory and the sheer volume of requests, neither NIO nor long polling is an option (i.e. I can't keep nearly enough connections open, nor can I even fit all of the requests in memory; once "a few seconds" have passed, I persist the excess requests). Likewise, client limitations are such that they cannot handle a completion callback instead. Finally, note I'm not interested in creating a "factory" resource that one POSTs to, as the extra roundtrips mean we fail the piecewise realtime constraint more than is desired (moreover, it's extra complexity; also, this is a resource that would benefit from caching).
I imagine there is some controversy over returning a 202 "Accepted" status code in response to a GET request, seeing as I've never seen it in practice, and its most intuitive use is in response to unsafe methods, but I've never found anything specifically discouraging it. Moreover, am I not preserving both safety and idempotency?
So, what do folks think about this approach?
EDIT: I should mention this is for a so-called business web API--not for browsers.
If it's for a well-defined and -documented API, 202 sounds exactly right for what's happening.
If it's for the public Internet, I would be too worried about client compatibility. I've seen so many if (status == 200) hard-coded.... In that case, I would return a 200.
Also, the RFC makes no indication that using 202 for a GET request is wrong, while it makes clear distinctions in other code descriptions (e.g. 200).
The request has been accepted for processing, but the processing has not been completed.
We did this for a recent application, a client (custom application, not a browser) POST'ed a query and the server would return 202 with a URI to the "job" being posted - the client would use that URI to poll for the result - this seems to fit nicely with what was being done.
The most important thing here is anyway to document how your service/API works, and what a response of 202 means.
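On the client side, the polling contract described here could be sketched like this. The `fetch` callable stands in for whatever HTTP call the client makes and is assumed to return a `(status, body)` pair; the function name and defaults are hypothetical.

```python
import time
from typing import Callable, Optional, Tuple

Response = Tuple[int, Optional[str]]


def poll_until_ready(fetch: Callable[[], Response],
                     max_attempts: int = 5,
                     delay_seconds: float = 0.0) -> Response:
    """Call fetch() until it stops answering 202 or attempts run out."""
    for _ in range(max_attempts):
        status, body = fetch()
        if status != 202:
            return status, body  # finished, or failed with another code
        time.sleep(delay_seconds)  # still processing, wait before retrying
    return 202, None  # gave up while the job was still pending
```

Whatever the client looks like, the meaning of 202 and the expected polling interval still need to be documented by the service.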
From what I can recall - GET is supposed to return a resource without modifying the server. Maybe activity will be logged or what have you, but the request should be rerunnable with the same result.
POST on the other hand is a request to change the state of something on the server. Insert a record, delete a record, run a job, something like that. 202 would be appropriate for a POST that returned but isn't finished, but not really a GET request.
It's all very puritan and not well practiced in the wild, so you're probably safe by returning 202. GET should return 200. POST can return 200 if it finished or 202 if it's not done.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
In case of a resource that is supposed to have a representation of an entity that is clearly specified by an ID (as opposed to a "factory" resource, as described in the question), I recommend staying with the GET method and, in a situation when the entity/representation is not available because of lazy-creation or any other temporary situation, use the 503 Service Unavailable response code that is more appropriate and was actually designed for situations like this one.
Reasoning for this can be found in the RFCs for HTTP itself (please verify the description of the 503 response code), as well as on numerous other resources.
Please compare with HTTP status code for temporarily unavailable pages. Although that question is about a different use case, it actually relates to the exact same feature of HTTP.
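If you go the 503 route, HTTP also defines a Retry-After header that tells the client when to try again. A minimal sketch, assuming a lookup that yields `None` while the representation is still being computed; `representation_response` and the 30-second default are hypothetical.

```python
from typing import Dict, Optional, Tuple


def representation_response(representation: Optional[str],
                            retry_after_seconds: int = 30) -> Tuple[int, Dict[str, str], str]:
    """Return 503 with Retry-After while the representation is not ready."""
    if representation is None:
        # Retry-After takes a delay in seconds (or an HTTP date).
        return 503, {"Retry-After": str(retry_after_seconds)}, ""
    return 200, {}, representation
```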

Resources