A Pattern for Apigee Error Handling
Consistent error handling is an important issue when it comes to REST API design. If HTTP is used as the transport protocol, it makes sense to stick to its specified semantics, such as using proper HTTP status codes in responses. Especially in error situations, it can be useful to deliver additional information in the response payload. This blog post takes a closer look at a pattern which describes how providing such information can be enforced when Apigee is used as the API gateway.
The ideas described here are based on a community forum post which was later extended in another one. The pattern presented here is what I would consider a practical way to achieve a consistent error representation that includes additional, valuable details in the payload. Everything described here makes use of standard Apigee mechanisms.
The blog post describes the topic in detail. It starts with a general introduction related to architecture and API design with strong focus on error representation. Afterwards, it describes how error handling can be realized on different maturity levels and uses a sample which is implemented in a GitHub repository. Finally, the pattern is described in a nutshell using some code snippets.
Apigee as Part of the System Architecture
This blog post bridges the gap between conceptual thoughts related to error handling in REST APIs and the concrete implementation using Apigee, which is a widely used product in the area of API gateways.
In fact, an API gateway like Apigee is not really the implementation of an API, but rather a platform which is located between the client using the API and the backend system providing the implementation of the API. One of its purposes is to handle cross-cutting concerns like security, monetization and throttling by exposing proxies for the backend system.
Such proxies are good candidates to enforce consistent error handling and a common error data format without the necessity to change the actual backend system. That is why it is worth spending some thought on how this can be achieved with Apigee.
API Design
For better understanding, the blog post is based on a sample use case.
The REST API exposed on Apigee provides a single resource for news, which is available at /news.
A news entry can be retrieved by its ID by sending a GET request to a path following the schema /news/{{id}}.
Such a news entry consists of a title and the content.
The Accept header specified in the request defines which format to use for the returned data.
If it is set to application/json, the data is provided as JSON, as shown below.
{
"title": "A Pattern for Apigee Error Handling",
"content": "Consistent error handling is an important issue when it comes to REST API design..."
}
If the client sets the Accept header to application/xml, the response data looks similar to the snippet shown below.
<news-entry>
<title>A Pattern for Apigee Error Handling</title>
<content>Consistent error handling is an important issue when it comes to REST API design...</content>
</news-entry>
The status code provided in the response must be compliant with the HTTP specification. Pages like the Mozilla documentation provide helpful overview tables. Additionally, a proper, human readable but short reason phrase is returned in conjunction with the status code. This applies to all responses in all situations and should be the absolute minimum when designing a REST API.
On the happy path, the status code 200 should be returned together with the reason phrase OK.
In this case, the response payload contains the news entry data in the desired format.
Besides this success scenario, there are several error situations which can occur and should be considered.
- The requested news entry does not exist. Therefore 404 is returned as status code. The reason phrase is News Entry Not Found.
- The Accept header is not specified in the request. Therefore 406 is returned as status code. The reason phrase is Missing Accept Header.
- The Accept header is set to something other than application/json or application/xml. Therefore 406 is returned as status code. The reason phrase is Unsupported Accept Header.
Access to the news should only be possible if the user is properly authorized using HTTP basic authentication, i.e. the relevant data is contained in the Authorization header of the request.
In addition to the ones already listed, there are the following error cases:
- The Authorization header is not set. Therefore 401 is returned as status code. The reason phrase is Unauthorized.
- Checking the value of the Authorization header reveals that the user is not permitted to access the requested news entry. Therefore 403 is returned as status code. The reason phrase is Forbidden.
Sample Repository
A sample repository with an API proxy implementing the pattern described here is available on GitHub. To try it out, the following requirements need to be fulfilled.
- The sample can be deployed even on an Apigee trial account. There is a sign up form available to get such a trial account within minutes.
- Node.js and npm need to be present on the machine used to deploy the sample to Apigee. If this is not already available, it can be installed by following this installation guide.
To deploy the sample, the repository needs to be cloned and some settings related to the target Apigee environment need to be adjusted.
- Copy the .env.example file to .env. The copied file must not be committed since it will contain credentials.
- Set proper values for the keys specified there.
To install the necessary dependencies for the deployment, the command below needs to be executed.
npm install
Once these preparation steps are finished, the deployment can be triggered with the command below.
npm run deploy
The deployment process is based on the apigeetool npm package.
The necessary deployment steps are implemented in the file scripts/deploy.ts.
Besides the API proxy code, the repository contains files which can be used to call the provided API using the built-in HTTP client of IntelliJ IDEA.
- The calls/rest-client.rest file defines the calls.
- The calls use some variables which are specific to the Apigee environment. To specify proper values, the file calls/rest-client.env.json.example needs to be copied to calls/rest-client.env.json. In the copied file, fill in the desired values.
For the sake of simplicity, no backend system is connected. The functionality described earlier is implemented as a dummy proxy which returns static data. This is sufficient for the purpose of this blog post.
For the same reason, the authorization check is simulated.
The user is treated as authorized only if the username dummy and the password letmein are used for HTTP basic authentication.
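One way to prepare such a simulated check with standard Apigee policies is sketched below. This is an assumption for illustration and not necessarily how the sample repository implements it: a BasicAuthentication policy in Decode mode writes the credentials from the Authorization header into flow variables. The policy name and the variable names custom.auth.username and custom.auth.password are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Hypothetical policy: decodes the Authorization header into two flow variables. -->
<BasicAuthentication async="false" continueOnError="false" enabled="true" name="BasicAuthentication.DecodeCredentials">
    <DisplayName>Decode HTTP basic authentication credentials</DisplayName>
    <Operation>Decode</Operation>
    <IgnoreUnresolvedVariables>false</IgnoreUnresolvedVariables>
    <!-- Variable names are assumptions chosen for this sketch. -->
    <User ref="custom.auth.username"/>
    <Password ref="custom.auth.password"/>
    <Source>request.header.Authorization</Source>
</BasicAuthentication>
```

A subsequent conditional step could then reject the request whenever custom.auth.username is not dummy or custom.auth.password is not letmein.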
API Proxy Structure
A request handled by an API proxy on Apigee is processed in several flows. The steps involved in handling a request are visualized in the diagram shown below. It is based on a similar diagram shown in the Apigee documentation.
The PreFlow of the ProxyEndpoint is a good place for common logic like authorization and desired response format checks.
The request to retrieve a news entry by its ID can be implemented using a ConditionalFlow inside the ProxyEndpoint section.
It should be executed whenever a GET request for /news/* is received.
The sample makes use of this ConditionalFlow to prepare the static dummy data returned in the response.
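As a rough sketch of this structure, a ProxyEndpoint could be laid out as shown below. The policy names, the flow name and the virtual host are assumptions for illustration, while the base path is taken from the sample requests in this post. The steps in the PreFlow refer to policies that raise faults, a mechanism explained in the next section. Since no backend is connected, the RouteRule does not reference a TargetEndpoint.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ProxyEndpoint name="default">
    <PreFlow name="PreFlow">
        <Request>
            <!-- Common checks; all policy names here are hypothetical. -->
            <Step>
                <Name>RaiseFault.Unauthorized</Name>
                <Condition>(request.header.authorization = null)</Condition>
            </Step>
            <Step>
                <Name>RaiseFault.MissingAcceptHeader</Name>
                <Condition>(request.header.accept = null)</Condition>
            </Step>
            <Step>
                <Name>RaiseFault.UnsupportedAcceptHeader</Name>
                <Condition>(request.header.accept != "application/json") and (request.header.accept != "application/xml")</Condition>
            </Step>
        </Request>
        <Response/>
    </PreFlow>
    <Flows>
        <!-- Conditional flow for GET requests to /news/* -->
        <Flow name="GetNewsEntry">
            <Condition>(proxy.pathsuffix MatchesPath "/news/*") and (request.verb = "GET")</Condition>
            <Request/>
            <Response>
                <Step>
                    <Name>AssignMessage.NewsEntry</Name>
                </Step>
            </Response>
        </Flow>
    </Flows>
    <HTTPProxyConnection>
        <BasePath>/apigee-errorhandling-sample</BasePath>
        <VirtualHost>secure</VirtualHost>
    </HTTPProxyConnection>
    <!-- No TargetEndpoint: the proxy answers with static data itself. -->
    <RouteRule name="noroute"/>
</ProxyEndpoint>
```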
Apigee Error Handling
Apigee provides general documentation related to error handling.
Basically, the RaiseFault policy is used to indicate that an error occurred.
Errors can be raised either automatically by executed policies or manually, e.g. if a conditional step triggers a RaiseFault policy.
If an error occurs, the execution of the regular flow is stopped.
An error flow starts and checks whether the API proxy defines a FaultRule which describes how to handle the particular error situation.
The steps of such a FaultRule should aim to provide a proper error response to the client.
If no FaultRule is executed, Apigee considers the DefaultFaultRule which can be defined on the API proxy level.
It acts as a kind of general fallback logic, which gets executed if no specific FaultRule matched.
If the AlwaysEnforce element of the DefaultFaultRule is set to true, the DefaultFaultRule is always executed as the last step, even if a FaultRule was already executed.
If neither a specific FaultRule nor the DefaultFaultRule is specified, Apigee takes care of the error itself.
It uses the fault data, e.g. the values set in the RaiseFault policy, to derive a default error response.
Following the goal of specifying error responses as part of the API design, it makes sense to explicitly handle errors within the API proxy and not to rely on the Apigee behavior related to unhandled errors.
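To make the evaluation order described above more tangible, the fragment below sketches where these elements sit inside a ProxyEndpoint. The rule name, the policy names and the fault name are placeholders and not taken from the sample.

```xml
<ProxyEndpoint name="default">
    <!-- Specific handlers, evaluated when an error occurs. -->
    <FaultRules>
        <FaultRule name="HandleSpecificFault">
            <Step>
                <Name>AssignMessage.SpecificFaultDetail</Name>
            </Step>
            <Condition>(fault.name = "SomeFaultName")</Condition>
        </FaultRule>
    </FaultRules>
    <!-- Fallback; with AlwaysEnforce set to true it also runs after a matching FaultRule. -->
    <DefaultFaultRule name="DefaultFaultRule">
        <Step>
            <Name>AssignMessage.GenericErrorResponse</Name>
        </Step>
        <AlwaysEnforce>true</AlwaysEnforce>
    </DefaultFaultRule>
    <!-- PreFlow, Flows, HTTPProxyConnection and RouteRule omitted for brevity. -->
</ProxyEndpoint>
```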
The next sections give an overview of how the Apigee error handling mechanisms can be used to enforce consistent error data structures. They start with a small solution which ensures that proper HTTP status codes and reason phrases are used. This is then improved step by step.
HTTP Specific Attributes in Error Response
According to the design of the REST API discussed here, proper HTTP status codes and reason phrases must be used as an absolute minimum. This section shows how this can be achieved when Apigee handles the errors.
In case of automatically raised errors, everything should already be specified properly. Documentation related to such errors and their status codes is available in the Apigee Policy Error Reference.
For manually raised errors, it is necessary to take care of the relevant settings in the corresponding RaiseFault policy XML file.
All possible settings are described in the policy documentation.
The snippet below defines a custom RaiseFault policy with proper values for the status code and the reason phrase.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<RaiseFault async="false" continueOnError="false" enabled="true" name="RaiseFault.Forbidden">
<DisplayName>Indicate that user is not permitted to access the resource</DisplayName>
<FaultResponse>
<Set>
<StatusCode>403</StatusCode>
<ReasonPhrase>Forbidden</ReasonPhrase>
</Set>
</FaultResponse>
</RaiseFault>
If the flow now triggered the RaiseFault.Forbidden policy, Apigee would derive an error response with the status code and reason phrase set accordingly.
An example of a request and the response is shown below.
Please note that Apigee chooses the response content type according to the Accept header of the request.
In the sample case it was set to application/json.
If application/xml is specified, an equivalent XML representation is returned.
GET http://<organization>-<environment>.apigee.net/apigee-errorhandling-sample/news/35711
HTTP/1.1 403 Forbidden
Date: Sat, 02 Nov 2019 21:33:05 GMT
Content-Type: application/json
Content-Length: 129
Connection: keep-alive
{
"fault": {
"faultstring": "Raising fault. Fault name : RaiseFault.Forbidden",
"detail": {
"errorcode": "steps.raisefault.RaiseFault"
}
}
}
Response code: 403 (Forbidden); Time: 2260ms; Content length: 129 bytes
This shows that it is possible to reach a minimum of consistent error responses by simply configuring manually triggered RaiseFault policies properly.
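The same approach carries over to the other error cases from the API design. As a sketch, a policy for a missing Accept header could look as follows. The policy name is an assumption, while the status code and reason phrase follow the design described earlier.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Hypothetical policy name; values taken from the API design section. -->
<RaiseFault async="false" continueOnError="false" enabled="true" name="RaiseFault.MissingAcceptHeader">
    <DisplayName>Indicate that the Accept header is missing</DisplayName>
    <FaultResponse>
        <Set>
            <StatusCode>406</StatusCode>
            <ReasonPhrase>Missing Accept Header</ReasonPhrase>
        </Set>
    </FaultResponse>
</RaiseFault>
```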
More Details in Payload of Error Response
It can make sense to provide additional error information in the response payload, using either a custom data structure or one matching a pre-defined schema like RFC 7807.
For manually raised errors, the RaiseFault policy can set the response payload accordingly.
The sample below shows what this could look like if a data structure compatible with RFC 7807 is used.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<RaiseFault async="false" continueOnError="false" enabled="true" name="RaiseFault.Forbidden">
<DisplayName>Indicate that user is not permitted to access the resource</DisplayName>
<FaultResponse>
<AssignVariable>
<Name>custom.error.detail</Name>
<Value>You are not allowed to access this resource.</Value>
</AssignVariable>
<Set>
<StatusCode>403</StatusCode>
<ReasonPhrase>Forbidden</ReasonPhrase>
<Payload contentType="application/problem+json" variablePrefix="@" variableSuffix="#">
{
"type": "apigee-errorhandling",
"title": "@message.reason.phrase#",
"status": "@message.status.code#",
"detail": "@custom.error.detail#",
"instance": "@request.path#"
}
</Payload>
</Set>
</FaultResponse>
</RaiseFault>
A sample request and the corresponding response for this case are shown below. The automatically derived response message contains the error details in the payload as specified in the snippet above.
GET http://<organization>-<environment>.apigee.net/apigee-errorhandling-sample/news/35711
HTTP/1.1 403 Forbidden
Date: Sat, 02 Nov 2019 21:55:13 GMT
Content-Type: application/problem+json
Content-Length: 376
Connection: keep-alive
{
"type": "apigee-errorhandling",
"title": "Forbidden",
"status": "403",
"detail": "You are not allowed to access this resource.",
"instance": "/apigee-errorhandling-sample/news/35711"
}
Response code: 403 (Forbidden); Time: 1887ms; Content length: 376 bytes
This goes further than the minimum described in the previous section and works fine for manually raised errors. However, automatically raised errors cannot produce such a data structure as long as Apigee derives the error response on its own. As a consequence, error responses would not be consistent at this stage, because the format would differ between automatically and manually raised errors.
Establish a Common Format for Error Details
The error response shown in the previous section looks pretty good so far. Nevertheless, there are some downsides.
- It is not possible to define multiple payload entries for different content types. Using the snippet above will always return the payload as application/problem+json, even if the Accept header is set to application/xml or another value. The RaiseFault policy is therefore specific to one response format. If multiple formats need to be supported, the same number of copies of the RaiseFault policy would be necessary. In general, this tight coupling between error indication (triggering the RaiseFault policy) and error representation (deriving the response to send to the client) is a problem.
- This only works for custom RaiseFault policies. In case of automatically raised errors, it is not possible to influence the representation in the response using the mechanisms discussed so far. This contradicts the goal, because automatically and manually raised errors would look different from a client point of view.
Both issues can be solved by decoupling error indication and error representation using fault rules.
The first step is to stick to standard HTTP-specific attributes and custom variables in the RaiseFault policy.
The sample already defines a variable called custom.error.detail.
The payload data used in the previous section needs to be removed.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<RaiseFault async="false" continueOnError="false" enabled="true" name="RaiseFault.Forbidden">
<DisplayName>Indicate that user is not permitted to access the resource</DisplayName>
<FaultResponse>
<AssignVariable>
<Name>custom.error.detail</Name>
<Value>You receive a 'forbidden' error because you set the scenario query parameter to the respective value.</Value>
</AssignVariable>
<Set>
<StatusCode>403</StatusCode>
<ReasonPhrase>Forbidden</ReasonPhrase>
</Set>
</FaultResponse>
</RaiseFault>
The next step is to take care of the conversion to the desired error detail format.
Apigee is not able to do this automatically, because it does not know which format to use.
Therefore some logic is needed which gets triggered in case of an error and takes care of the conversion.
This is what the DefaultFaultRule element on the API proxy level is used for.
<DefaultFaultRule name="DefaultFaultRule">
<Step>
<Name>AssignMessage.ConvertErrorToPlaintext</Name>
<Condition>(request.header.accept = null) or ((request.header.accept != "application/json") and (request.header.accept != "application/xml"))</Condition>
</Step>
<Step>
<Name>AssignMessage.ConvertErrorToJson</Name>
<Condition>(request.header.accept = "application/json")</Condition>
</Step>
<Step>
<Name>AssignMessage.ConvertErrorToXml</Name>
<Condition>(request.header.accept = "application/xml")</Condition>
</Step>
<AlwaysEnforce>true</AlwaysEnforce>
</DefaultFaultRule>
The snippet above shows what this could look like.
The most important thing is that <AlwaysEnforce/> is set to true.
This means that the DefaultFaultRule is always triggered, even if another FaultRule was already executed.
This is necessary because the common error detail format should be enforced.
There are three conditional steps defined.
- One step creates an XML representation of the error if the Accept header is set to application/xml.
- One step creates a JSON representation of the error if the Accept header is set to application/json.
- One step creates a plaintext representation of the error if no Accept header or an unsupported one is set.
The conversion is handled in separate AssignMessage policies.
As an example, the policy for JSON conversion is shown.
It is similar to the parts which were removed from the RaiseFault policy earlier.
It picks up the variables and uses them to create the error detail data structure which gets assigned to the response payload.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<AssignMessage name="AssignMessage.ConvertErrorToJson">
<DisplayName>Derive error details for JSON format</DisplayName>
<Set>
<Payload contentType="application/problem+json" variablePrefix="@" variableSuffix="#">
{
"type": "apigee-errorhandling",
"title": "@message.reason.phrase#",
"status": "@message.status.code#",
"detail": "@custom.error.detail#",
"instance": "@request.path#"
}
</Payload>
</Set>
<IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
</AssignMessage>
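The XML counterpart is not shown in this post. A minimal sketch of what AssignMessage.ConvertErrorToXml could look like is given below, assuming the XML format defined in RFC 7807 (root element problem in the namespace urn:ietf:rfc:7807, media type application/problem+xml). The actual policy in the sample repository may differ.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Sketch only: mirrors the JSON policy, using the RFC 7807 XML representation. -->
<AssignMessage name="AssignMessage.ConvertErrorToXml">
    <DisplayName>Derive error details for XML format</DisplayName>
    <Set>
        <Payload contentType="application/problem+xml" variablePrefix="@" variableSuffix="#">
            <problem xmlns="urn:ietf:rfc:7807">
                <type>apigee-errorhandling</type>
                <title>@message.reason.phrase#</title>
                <status>@message.status.code#</status>
                <detail>@custom.error.detail#</detail>
                <instance>@request.path#</instance>
            </problem>
        </Payload>
    </Set>
    <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
</AssignMessage>
```

A plaintext variant would follow the same shape, with contentType set to text/plain and the variables concatenated into a single line.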
With this, the decoupling of error indication and error representation is achieved.
The RaiseFault policy takes care of defining the HTTP-specific attributes and additional variables, which are used to create the error representation in the fault rules of the API proxy.
The good thing is that this leads to a minimum acceptable representation of the error even if no fault rule for proper conversion is registered.
If Apigee picks up the error to automatically derive the response, it will always contain a proper HTTP status code and reason phrase, because they are specified using the standard attributes in the RaiseFault policy.
The same mechanism applies in case of automatically raised errors.
In this case, the custom.error.detail variable won't be set.
To cover this case, IgnoreUnresolvedVariables should be set to true in the AssignMessage policy which creates the error representation.
In case of automatically raised errors, the detail message is then empty in the response while all other values are set.
Therefore the goal of enforcing a common error detail data structure is now achieved.
Optionally Enrich Automatically Raised Errors
As previously stated, the detail message is missing in case of automatically raised errors.
To get proper detail messages in such cases, it is possible to introduce additional fault rules on the API proxy level.
The FaultRule shown below is triggered if a fault with the name UnresolvedVariable occurred.
Please note that using the fault name in the condition is the approach recommended in the Apigee documentation.
<FaultRules>
<FaultRule name="EnhanceInternalServerError">
<Step>
<Name>AssignMessage.InternalServerErrorDetail</Name>
</Step>
<Condition>(fault.name = "UnresolvedVariable")</Condition>
</FaultRule>
</FaultRules>
If the condition is true, an AssignMessage policy is triggered.
Its purpose is to set all necessary custom variables referenced in the error detail data structure.
In case of the sample, it has to set the custom.error.detail variable as shown below.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<AssignMessage async="false" continueOnError="false" enabled="true" name="AssignMessage.InternalServerErrorDetail">
<DisplayName>Add details for automatically raised internal server error</DisplayName>
<AssignVariable>
<Name>custom.error.detail</Name>
<Value>Please check to find out why this error occurred.</Value>
</AssignVariable>
</AssignMessage>
With this in place, the error detail message is also available for this automatically raised error. The sample below shows a matching request and its response.
GET http://<organization>-<environment>.apigee.net/apigee-errorhandling-sample/news/112
HTTP/1.1 500 Internal Server Error
Date: Sat, 02 Nov 2019 23:32:13 GMT
Content-Type: application/problem+json
Content-Length: 314
Connection: keep-alive
{
"type": "apigee-errorhandling",
"title": "Internal Server Error",
"status": "500",
"detail": "Please check to find out why this error occured.",
"instance": "/apigee-errorhandling-sample/news/112"
}
Response code: 500 (Internal Server Error); Time: 1857ms; Content length: 314 bytes
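Further automatically raised errors can be covered by additional rules of the same shape. The sketch below assumes that a Quota policy is attached to the proxy, which is not the case in the sample. QuotaViolation is the fault name raised by that policy, while the second rule name and its AssignMessage policy name are hypothetical.

```xml
<FaultRules>
    <FaultRule name="EnhanceInternalServerError">
        <Step>
            <Name>AssignMessage.InternalServerErrorDetail</Name>
        </Step>
        <Condition>(fault.name = "UnresolvedVariable")</Condition>
    </FaultRule>
    <!-- Hypothetical second rule for a fault raised by a Quota policy. -->
    <FaultRule name="EnhanceQuotaViolation">
        <Step>
            <Name>AssignMessage.QuotaViolationDetail</Name>
        </Step>
        <Condition>(fault.name = "QuotaViolation")</Condition>
    </FaultRule>
</FaultRules>
```

Each rule only sets custom.error.detail; the conversion into the common format still happens in the DefaultFaultRule.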
Reuse Error Conversion Logic
The mechanisms explained so far are limited to a single API proxy, which is not sufficient in many cases. From an API consumer point of view, an API is the API product, which can consist of more than one API proxy. To establish a consistent error representation across multiple proxies, the conversion logic can be put in a shared flow to make it reusable. Shared flows can be developed and deployed separately.
In the sample, the shared flow for the error conversion could look as shown below.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<SharedFlow name="default">
<Step>
<Name>AssignMessage.ConvertErrorToPlaintext</Name>
<Condition>(request.header.accept = null) or ((request.header.accept != "application/json") and (request.header.accept != "application/xml"))</Condition>
</Step>
<Step>
<Name>AssignMessage.ConvertErrorToJson</Name>
<Condition>(request.header.accept = "application/json")</Condition>
</Step>
<Step>
<Name>AssignMessage.ConvertErrorToXml</Name>
<Condition>(request.header.accept = "application/xml")</Condition>
</Step>
</SharedFlow>
To integrate the shared flow in a specific proxy, a FlowCallout policy is necessary.
It simply references the shared flow by its name, which could look similar to the snippet shown below.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<FlowCallout async="false" continueOnError="false" enabled="true" name="FlowCallout.ErrorConversion">
<DisplayName>Delegate error conversion to the shared flow</DisplayName>
<SharedFlowBundle>error-conversion</SharedFlowBundle>
</FlowCallout>
This policy is then triggered in the DefaultFaultRule of each proxy.
As a result, the shared flow is triggered as the last step whenever an error occurred.
<DefaultFaultRule name="DefaultFaultRule">
<Step>
<Name>FlowCallout.ErrorConversion</Name>
</Step>
<AlwaysEnforce>true</AlwaysEnforce>
</DefaultFaultRule>
The result from a client point of view is unchanged. A useful feature for debugging is that the steps which are part of a shared flow are also shown in the trace view. In the screenshot below, the policies related to error conversion are shown and grouped in a box which represents the shared flow.
Ideas for Further Optimization
What is available right now is a reusable solution, which requires that each proxy correctly implements the default fault rule to trigger the conversion. Based on this, the final step would be a solution in which this behavior is enforced automatically without any implementation in the proxy.
One idea was to check whether fault rules and a default fault rule are also available in shared flows. If they were, it would be possible to register the shared flow as a flow hook, which is automatically triggered e.g. before the regular processing starts or after it has finished.
It turned out that such a solution is not possible, at least not using fault rules. A shared flow is simply a reusable sequence of conditional steps which can be integrated in a proxy. At execution time this looks similar to steps directly added in the proxy, i.e. the proxy still surrounds the shared flow and is therefore responsible for error handling with fault rules. This matches the fact that there are no fault rules available for shared flows.
Since such an enforced runtime dependency between the proxy and a shared flow cannot be realized, there are some other alternatives. The following list is inspired by a community forum post.
- Include such logic during build time. It would be possible to run a build process which adds a proper default fault rule to each proxy. The necessary steps for the conversion could be added directly or a trigger for a shared flow could be added.
- Provide templates and design patterns. It could make sense to provide such templates to offer a quickstart path for developing new proxies. Such a template could contain the desired error handling implementation. In addition, it is often useful to document patterns for different use cases, like the one for consistent error details. This blog post and the sample repository are an example.
The Pattern in a Nutshell
This blog post covered a lot of ground. To make the pattern easier to apply, a short overview is given below.
- Error handling is an important topic to consider during REST API design. As a minimum, proper HTTP status codes and reason phrases should be used. To improve on this, it can make sense to provide error details in a consistent way.
- For manually triggered errors, the RaiseFault policies should define semantically correct HTTP status codes and understandable reason phrases using the standard attributes StatusCode and ReasonPhrase. Additional custom attributes for error details should also be added directly in the RaiseFault policy configuration.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<RaiseFault async="false" continueOnError="false" enabled="true" name="RaiseFault.Forbidden">
<DisplayName>Indicate that user is not permitted to access the resource</DisplayName>
<FaultResponse>
<AssignVariable>
<Name>custom.error.detail</Name>
<Value>You receive a 'forbidden' error because you set the scenario query parameter to the respective value.</Value>
</AssignVariable>
<Set>
<StatusCode>403</StatusCode>
<ReasonPhrase>Forbidden</ReasonPhrase>
</Set>
</FaultResponse>
</RaiseFault>
- For each automatically triggered error, a FaultRule should be added in order to provide additional, custom attributes for error details in a dedicated AssignMessage policy.
<FaultRules>
<FaultRule name="EnhanceInternalServerError">
<Step>
<Name>AssignMessage.InternalServerErrorDetail</Name>
</Step>
<Condition>(fault.name = "UnresolvedVariable")</Condition>
</FaultRule>
</FaultRules>
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<AssignMessage async="false" continueOnError="false" enabled="true" name="AssignMessage.InternalServerErrorDetail">
<DisplayName>Add details for automatically raised internal server error</DisplayName>
<AssignVariable>
<Name>custom.error.detail</Name>
<Value>Please check to find out why this error occurred.</Value>
</AssignVariable>
</AssignMessage>
- A SharedFlow with the purpose of error conversion should be provided. It takes care of deriving a proper and consistent error representation.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<SharedFlow name="default">
<Step>
<Name>AssignMessage.ConvertErrorToPlaintext</Name>
<Condition>(request.header.accept = null) or ((request.header.accept != "application/json") and (request.header.accept != "application/xml"))</Condition>
</Step>
<Step>
<Name>AssignMessage.ConvertErrorToJson</Name>
<Condition>(request.header.accept = "application/json")</Condition>
</Step>
<Step>
<Name>AssignMessage.ConvertErrorToXml</Name>
<Condition>(request.header.accept = "application/xml")</Condition>
</Step>
</SharedFlow>
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<AssignMessage name="AssignMessage.ConvertErrorToJson">
<DisplayName>Derive error details for JSON format</DisplayName>
<Set>
<Payload contentType="application/problem+json" variablePrefix="@" variableSuffix="#">
{
"type": "apigee-errorhandling",
"title": "@message.reason.phrase#",
"status": "@message.status.code#",
"detail": "@custom.error.detail#",
"instance": "@request.path#"
}
</Payload>
</Set>
<IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
</AssignMessage>
- Each API proxy should define a DefaultFaultRule with AlwaysEnforce set to true. It should trigger the SharedFlow for error conversion.
- The easiest way to share this across teams is most likely to provide a template repository including an API proxy with proper and consistent error handling together with a pattern description. If this is not enough, build time inclusion could be worth a look.
Conclusion
This blog post took a closer look at error handling in Apigee API proxies. The mechanisms provided by the platform were explained and used in a way which makes it possible to reach consistent error handling and representation.
Please feel free to consider the thoughts described here as well as the sample code provided in the GitHub repository.