You can stitch APIs together thanks to GraphQL. It’s an incredibly useful approach that let’s you have unified, transparent access to multiple GraphQL APIs. This can make it easier to access data that is split across multiple APIs without understanding where exactly it’s located. In part one, we looked at the concept and purpose behind combining APIs. In this sequel, we will delve into a working example of GraphQL schema stitching to appreciate how it all works.
Let’s dive into how we can build the stitching proxy that mashes up two data sources.
The first step is to link up to a remote schema. To create such a schema, we need to create an HTTP link which goes to GitHub’s GraphQL API. Authorization is also taken care of in this step by passing a GitHub token with the necessary access privileges we need for this project. The context manipulates every request before going out to that link and adds the necessary token into the header.
We’ll execute an introspection query via that link which will download the schema from the API and make it available locally to manipulate. Following that we will make the schema executable, which in this case means that any call to a query or type in the schema is deferred to this link and passed along. Starting our server with a local proxy for the GitHub API takes care of authentication.
We need 2 APIs to make a stitched GraphQL one. So in our second step, we’re doing the same thing above but with Contentful. We’re getting the executable schemas for both these APIs and merging them together in its default configuration. GraphQL Tools simply combines everything into one big schema, but intelligently remembers which object belongs to which API and delegates them correctly.
At this point, things exist in each schema (a GitHub schema with data from GitHub and a Contentful equivalent with data from Contentful) without any references between them.
The problem here is that these schemas conflict since they both have the same
type with the name “Repository”. Every type and query in a GraphQL schema has to be unique. With this setup, we will only be able to query one of those APIs instead of both.
The GraphQL specifications include the definition of an Abstract Syntax Tree (AST), which can be used to manipulate the schema programmatically.
There’s more code involved with the Contentful schema. We set it up like we did previously before passing it to
transformSchema. This provides an array of transformations to schema that turn it into a format that works for my use case.
In this case, we rename all types that contain the word
repository and replace them with
repositoryMetadata. We also take care of the top-level queries called root fields. Both APIs have root fields called
repository. We’re also going to go ahead and rename those to
With the above, we still have the exact same schema from GitHub, but we’ve modified the Contentful schema to rename every type and query with the word
repository to become
repositoryMetadata. GraphQL tools is sufficiently advanced to seamlessly pass the data on to the right API, despite the type
repositoryMetadata not existing within Contentful’s GraphQL API.
Time to get stitching: We have the two schemas living in the same API and we can query against them without knowing what the URL is. However, the data isn’t linked and we still need to stitch it up.
To do that, we need to write some new type definitions — extending types of existing schemas (or the one single schema we have now) to build the links between objects. For that, we can use the
extend keyword that allows you to manipulate existing types. In this case, we add a field
contentfulMetadata to GitHub’s repository type and extend Contentful’s
repositoryMetadata type with a field named
gitHubRepository that contains the other object.
We also need to write a resolver to resolve the fields for those objects. Let’s first take the Contentful repository metadata object. We defined it such that the field
gitHubRepository unwrapped object will be resolved by this function. The first item we’re adding is a fragment, which is a small piece of reusable GraphQL query, that should be included whenever we query for the
gitHubRepository field. In this case, the fragment queries for the GitHub name and login, which is necessary because these will be used to ask the GitHub API.
To request for this data, we use
mergeInfo.delegateToSchema to instruct the server that the information sought by the user lives in another schema that we’ve delegated to it (by having queried the repository query with the GitHub name and login of the relevant repository, retrieved by the earlier fragment).
All of this is straightforward but what if the APIs don’t fit so neatly together, the data is not directly available in the format we want, or not a clear query to request for the data we need. This is the exact problem when querying Contentful for data — Contentful has a quirk where we can only ask for a collection of objects, not an individual object by one of its fields.
So let’s define a fragment which states that we need a name and the owner’s log in (which is the organization name) whenever we query for Contentful metadata. Again, we are delegating a schema using a query but as you can see, we are querying for a collection instead of an individual object — this is a problem because our email says Contentful metadata will be an individual object, not a collection of them.
When we delegate to a schema, we can add a new transformation to the query and result that we get from that. This is what we’ve done and
wrapQuery allows me to modify the query for repository metadata connection to add a new level in between, so instead of query directly for the data, we are querying for the items, then data. By doing this, we would also be modifying results to simply return the first entry in the items array.
Now that we have my schemas stitched with the references between the two, there’s more we can do with these tools. We can also infer new data — one of my favorite API endpoints in GitHub’s API is the "read a file from my GitHub repository and return it to the API endpoint". To GraphQL, it’s just another field to query for. But what if we don’t just want to read this data and manipulate it in our API — what if we could take information contained in these files and expose them as fields in my modified GitHub API.
For this use case, we need to know if this repository has a TravisCI YAML file because if it does, we would be able to query TravisCI for the basis of this project. So we’ll expand the type repository that comes from the GitHub GraphQL API, add a new field
hasTravisCi (it will be a boolean and will never be null).
To write a resolver, we first need to have the fragment. We query GitHub object in the master branch for the file
master:.travis.yml which we are supposed to treat like a text block (a UTF-8 encoded string). Then, we take that file and check if it exists.
The main takeaway here is to stop treating your APIs as silos — there are relationships between them. When you start making use of those relationships programmatically, you can seamlessly access data across these APIs to make the life of your developers easier.
To rehash, here are key improvements you stand to benefit from when working with GraphQL and schema stitching:
Repositoryobject from GitHub in the example above), you can also ask for information that another API (Contentful in this instance) holds (such as which team maintains the repository)
Plus you get all the usual benefits of a GraphQL API:
Watch the recording of my GitHub Universe talk about GraphQL schema stitching to see this example in action.