What is the purpose of using HTTP status codes to describe REST API domain errors?

For REST APIs I was taught to think in HTTP status codes like

  • 200 => GET, PUT, PATCH, DELETE ( sometimes 204 ) requests
  • 201 => POST requests
  • 400 => request validation failed
  • 401 => user is not logged in
  • 403 => user is lacking permissions
  • 404 => a resource ( e.g. document ) wasn't found
  • 409 => a domain specific error occurred ( you can't delete the document because you have to do other things first )
  • 500 => something failed ( e.g. exception was thrown )
  • 501 => endpoint exists but is not implemented yet

Yes, it's quite simple to understand and easy to implement but aren't HTTP status codes made for the actual server? If I can't delete a document because I need to do other things first, shouldn't this be a 200 ( because the request was fulfilled ) with a payload like

{
  "success": false,
  "data": null,
  "error": {
    "code": "ERR_CODE_GOES_HERE",
    "message": "Human readable message goes here."
  }
}

Instead of sending a technical HTTP 409 with a payload like

{
  "code": "ERR_CODE_GOES_HERE",
  "message": "Human readable message goes here."
}

To me it feels like almost every response should have a 200 status code because the actual server was able to fulfill the request but the backend might point to a domain specific problem.

I know this approach is very common and API consumers can simply do if(!response.ok) {...handle error...} instead of parsing the response and inspecting it.
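To make the difference concrete, here is a minimal sketch of the client side under both conventions. The response objects are plain stand-ins (hypothetical, shaped like the `status` and `ok` fields of a fetch response plus an already-parsed `body`), and the function names are made up for this example.

```javascript
// Convention A: the status code carries the outcome (e.g. 409 for a domain error).
// `response` is a plain stand-in object: { status, ok, body }.
function handleWithStatusCodes(response) {
  if (!response.ok) {
    // The failure is known before the body's shape is inspected.
    return { failed: true, code: response.body.code };
  }
  return { failed: false, data: response.body };
}

// Convention B: always 200, the outcome lives in an envelope inside the body.
function handleWithEnvelope(response) {
  // The client has to parse and inspect the payload on every response.
  if (response.body.success === false) {
    return { failed: true, code: response.body.error.code };
  }
  return { failed: false, data: response.body.data };
}

// Stand-ins for the two payloads shown above.
const conflict = {
  status: 409,
  ok: false,
  body: { code: "ERR_CODE_GOES_HERE", message: "Human readable message goes here." },
};
const enveloped = {
  status: 200,
  ok: true,
  body: {
    success: false,
    data: null,
    error: { code: "ERR_CODE_GOES_HERE", message: "Human readable message goes here." },
  },
};

console.log(handleWithStatusCodes(conflict));
console.log(handleWithEnvelope(enveloped));
```

Both handlers reach the same conclusion; the difference is whether code that knows nothing about the payload format can also reach it.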

Are there any other reasons for doing so?

There are 2 best solutions below

Code-Apprentice On

If I can't delete a document because I need to do other things first, shouldn't this be a 200 ( because the request was fulfilled ) with a payload like

The request wasn't fulfilled, because nothing was deleted. The whole point of these status codes is to give your users information about the outcome of the request: 2xx means success, 4xx means the client did something wrong, and 5xx means the server did something wrong. In practice, the user should never see 5xx codes, because they usually indicate a bug on the server; avoiding and fixing these should be a high priority. A 4xx code, on the other hand, tells the user that they did something wrong, and you can include additional information in the body of the response to help them understand how to fix the problem. A response body isn't exclusive to 200, so no, not every response should be a 200 status code.
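As a rough illustration of that split, the sketch below shows a hypothetical delete handler that picks a status code according to who needs to act. The in-memory `documents` map, the `pendingTasks` field, and the error codes are all invented for the example; the handler returns a plain `{ status, body }` object so that it does not depend on any particular framework.

```javascript
// Hypothetical in-memory store; `pendingTasks` stands in for
// "other things you have to do first".
const documents = new Map([
  ["doc-1", { pendingTasks: 0 }],
  ["doc-2", { pendingTasks: 3 }],
]);

// Returns { status, body } instead of writing to a real HTTP response.
function deleteDocument(id) {
  const doc = documents.get(id);
  if (doc === undefined) {
    return {
      status: 404,
      body: { code: "DOCUMENT_NOT_FOUND", message: "No such document." },
    };
  }
  if (doc.pendingTasks > 0) {
    // 4xx: the client has to change something before retrying,
    // and the body explains what.
    return {
      status: 409,
      body: {
        code: "DOCUMENT_HAS_PENDING_TASKS",
        message: "Finish the pending tasks before deleting this document.",
      },
    };
  }
  documents.delete(id);
  return { status: 204, body: null };
}

// Maps a status code to the party that needs to act on it.
function whoMustAct(status) {
  if (status >= 200 && status < 300) return "nobody";
  if (status >= 400 && status < 500) return "client";
  if (status >= 500 && status < 600) return "server";
  return "other";
}

const blocked = deleteDocument("doc-2");
console.log(blocked.status, whoMustAct(blocked.status), blocked.body.code);
```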

To me it feels like almost every response should have a 200 status code because the actual server was able to fulfill the request but the backend might point to a domain specific problem.

Maybe you mean that a 200 should mean that there was a successful connection to the server. However, this isn't the meaning of "fulfill the request". We only say a request was "fulfilled" if the requested action was successful.

In addition, if there is no connection to the server, you will get an error, but it won't be an HTTP status code, because the error occurs in a layer below HTTP. I recommend that you read about the OSI network model. HTTP is an application-layer protocol in that model. It sits at the top, and there are several other layers that the communication must go through to even get to HTTP. For example, TCP is a transport-layer protocol and your Ethernet card operates at the physical layer. Errors at these levels will cause the request to fail, but they won't result in an HTTP status code, because they occur before HTTP has a chance to process anything.
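A small sketch of that distinction from the client's point of view follows. The `send` parameter is a hypothetical stand-in for a real HTTP client call, and it is kept synchronous for brevity (real clients are asynchronous): it either throws, meaning no HTTP exchange took place, or returns an object with a `status`.

```javascript
// `send` is a hypothetical stand-in for a real HTTP client call.
function describeOutcome(send) {
  let response;
  try {
    response = send();
  } catch (err) {
    // The failure happened below the application layer (DNS, TCP, cabling...),
    // so there is no HTTP response and therefore no status code at all.
    return { layer: "network", status: null, detail: err.message };
  }
  // An HTTP response exists, so a status code exists, whatever its value.
  return { layer: "http", status: response.status, detail: null };
}

const refused = describeOutcome(() => {
  throw new Error("connection refused");
});
const conflictOutcome = describeOutcome(() => ({ status: 409 }));

console.log(refused);
console.log(conflictOutcome);
```

Even the 409 case counts as a successful connection here: the server was reached and answered, it just declined to perform the action.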

VoiceOfUnreason On

Yes, it's quite simple to understand and easy to implement but aren't HTTP status codes made for the actual server?

Very close! They aren't there for the server, they are there for other general-purpose HTTP components (browsers, spiders, proxies, caches). They communicate the general nature of the response payload (is it a representation of the resource, or a description of an error condition?), and also communicate the semantics of some of the header fields (e.g. 201 tells you how to interpret the Location field, 3xx tells you how to interpret the Retry-After field, and so on).
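To illustrate what a general-purpose component can do with nothing but the status code, here is a simplified sketch. The list of codes is the set that RFC 9110 defines as heuristically cacheable; the function names are invented, and a real cache would also look at the request method and at explicit cache controls, which this sketch deliberately ignores.

```javascript
// Status codes that RFC 9110 defines as heuristically cacheable.
const HEURISTICALLY_CACHEABLE = new Set([
  200, 203, 204, 206, 300, 301, 308, 404, 405, 410, 414, 501,
]);

// A generic cache never parses the application payload; in this simplified
// model it decides from the status code alone.
function mayStoreWithoutExplicitFreshness(status) {
  return HEURISTICALLY_CACHEABLE.has(status);
}

// The same header field means different things under different status codes.
function meaningOfLocation(status) {
  if (status === 201) return "identifies the resource that was created";
  if (status >= 300 && status < 400) return "identifies the redirect target";
  return "not covered by this sketch";
}

console.log(mayStoreWithoutExplicitFreshness(200));
console.log(mayStoreWithoutExplicitFreshness(409));
console.log(meaningOfLocation(201));
```

Under this simplified model, an error description sent with a 200 looks like an ordinary representation to such a component, whereas the same description sent with a 409 does not.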

Status codes are metadata of the transfer-of-documents-over-a-network domain.

To me it feels like almost every response should have a 200 status code because the actual server was able to fulfill the request but the backend might point to a domain specific problem.

Not every response, no -- but responses that take a not-the-happy-path route through the domain logic certainly should be 200.

Naive example: if you have an API for an online quiz, and a submitted answer is correct, what should the status code be? 200. If the submitted answer is incorrect, what should the status code be? 200.
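The quiz example might look like the following sketch. The answer key, the question ids, and the response shape are hypothetical; the point is only that "incorrect answer" is a domain outcome reported in the body, while the 404 branch covers a request that targets something that doesn't exist.

```javascript
// Hypothetical answer key for the quiz.
const answerKey = new Map([["q1", "B"]]);

// Returns { status, body } so the sketch stays framework-independent.
function submitAnswer(questionId, answer) {
  if (!answerKey.has(questionId)) {
    // The target resource doesn't exist: that is an HTTP-level fact.
    return { status: 404, body: { message: "Unknown question." } };
  }
  // Right or wrong, the submission was received and graded, so the
  // response is a normal representation of the grading result.
  const correct = answerKey.get(questionId) === answer;
  return { status: 200, body: { questionId, correct } };
}

console.log(submitAnswer("q1", "B"));
console.log(submitAnswer("q1", "C"));
```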


That said, a lot of people got this wrong, in part because the "cost" of getting it wrong isn't necessarily expensive (or even obvious). And that also means that, at this point, a lot of people have gotten it wrong for a long time: there's a lot of inertia that resists change.

For example: many resource models include a "health-check" resource that can be used to monitor the health of the server itself. And we have systems that we can run which monitor the health-check resource, and fire off alarms if a problem is detected.

The way this should work is that we GET the current representation of the health-check resource, and examine the representation (or the representation metadata) to determine healthy vs unhealthy.

In practice what we have instead is systems that expect the status code of the response to vary with the health of the server. That's inconsistent with the uniform interface constraint, and requires a shared, out-of-band understanding of the resource semantics.

But in most of the cases that we actually care about, it does the job that we want done, so ...?
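The two monitoring styles described above can be contrasted in a short sketch. The response objects are plain stand-ins, and the `status` field inside the body (with the values "healthy" and "unhealthy") is an assumed representation format, not something any particular health-check tool prescribes.

```javascript
// Style 1: the GET itself succeeded (200), and health is read from the
// representation of the health-check resource.
function isHealthyByRepresentation(response) {
  return response.status === 200 && response.body.status === "healthy";
}

// Style 2 (common in practice): health is inferred from the status code
// alone, which relies on an out-of-band agreement about what it means.
function isHealthyByStatusCode(response) {
  return response.status >= 200 && response.status < 300;
}

// A server reporting trouble under each convention.
const unhealthyAsRepresentation = { status: 200, body: { status: "unhealthy" } };
const unhealthyAsStatusCode = { status: 503, body: { status: "unhealthy" } };

console.log(isHealthyByRepresentation(unhealthyAsRepresentation));
console.log(isHealthyByStatusCode(unhealthyAsStatusCode));
// A monitor of style 2 pointed at a server of style 1 misses the problem:
console.log(isHealthyByStatusCode(unhealthyAsRepresentation));
```

The last call shows why the choice needs to be written down: the monitor and the server have to agree on which convention is in use.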


Given today's context, I think the right answer in most cases is going to involve an architectural decision record to document your understanding of the tradeoffs at the time, so that future you will understand how the current implementation came about.