How to build an API with the best GraphQL performance

Published on July 9, 2025

Written by

David Fateh

Software Engineer

Contentful

If you're building a GraphQL API, performance needs to be your main concern. The flexibility of GraphQL can be harnessed to get excellent performance, but without careful planning, it can give you the reverse — slow queries, overloaded databases, and poor UX.

A large part of good GraphQL API design means ensuring that its consumers can fetch data in the most efficient way possible. You'll need to build your API thoughtfully, adding smart caching, optimizing database queries (including setting limits to prevent overly complex queries), and handling filtering and pagination effectively.

In this article, we explain the key principles for improving GraphQL performance so your API is fast and reliable at scale.

GraphQL vs. REST performance

When GraphQL first arrived, it was touted as having better performance than REST — especially for data-heavy applications. But for some developers, these promises didn't match their reality of frustrating slowdowns at scale.

When implemented well, GraphQL is the better performer for data-heavy apps. But a poorly designed GraphQL API can lead to so much complexity and inefficiency that you'll wish you'd stuck with a simple and straightforward REST endpoint.

To get the best performance from your GraphQL API — and avoid performance bottlenecks — you need to focus on a few key areas:

Key focus areas for improving your API's GraphQL performance

Improving your site's performance requires a full-stack effort — you need to consider what improvements you can make across your entire stack. When it comes to GraphQL performance, the areas with the potential for the biggest impact include caching strategies, efficient schema design, scaling your GraphQL architecture, and database-level optimizations. Improving all these areas will go a long way toward improving the responsiveness and scalability of your API.

1. Caching

To be clear, GraphQL itself doesn't actually include any native caching mechanisms — any caching you want to add you need to build yourself. And you definitely should add caching to reduce latency (and the general load on your network or server). There are many ways to add caching or to add improvements to assist with caching, and you'll need to decide which combination is right for you.

Persisted queries

A persisted query is when a GraphQL query is predefined and stored on the server with a unique identifier — typically a standard ID or a hash of the GraphQL query string. The client can then run the query by sending a request with the identifier, rather than having to send the entire query. This reduces network traffic, as the payloads are smaller.

The most basic persisted query implementation is when you, as the API developer, choose to store the most common queries on your server with an ID.

For example, a query that fetches a user's profile could be stored on the server under an ID of your choosing.

If a client wanted to get the user profile for a user with id "123", instead of sending the full query in an HTTP request, they can now send a request to run the query by its ID. The longer the query, the more time this saves.

This is different to caching: caching stores the query response, whereas a persisted query stores the query itself, which still has to be run on each request.
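As a rough illustration, here is a minimal Python sketch of a server that accepts either a full query or the ID of a persisted one. The `getUserProfile` ID, the shape of the query, and the `run_query` stand-in are all assumptions made for this example; a real server would hand the query to its GraphQL executor.

```python
# Hypothetical store of persisted queries, keyed by an ID chosen by the API developer.
PERSISTED_QUERIES = {
    "getUserProfile": """
        query GetUserProfile($id: ID!) {
          user(id: $id) { id name email }
        }
    """,
}


def run_query(query: str, variables: dict) -> dict:
    """Stand-in for a real GraphQL executor; it only echoes what it would run."""
    return {"executed": " ".join(query.split()), "variables": variables}


def handle_request(body: dict) -> dict:
    """Accept either a full query string or the ID of a persisted query."""
    if "id" in body:
        query = PERSISTED_QUERIES.get(body["id"])
        if query is None:
            return {"errors": [{"message": "Unknown persisted query ID"}]}
    else:
        query = body["query"]
    return run_query(query, body.get("variables", {}))


# The client sends a small payload instead of the whole query string.
small_request = {"id": "getUserProfile", "variables": {"id": "123"}}
result = handle_request(small_request)
```

The request body stays the same size no matter how long the stored query is, which is where the network saving comes from.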

It's worth noting that persisted queries aren't part of the core GraphQL specification, but many of the big GraphQL libraries (like Apollo, Relay, or GraphQL Yoga) use them — as does the Contentful GraphQL API.

If you want to give your consumers the flexibility to choose which queries to persist, this is something you can build into your software. Apollo's Automatic Persisted Queries (APQ) feature does this, and many other companies have followed suit. It works by making the query identifier a hash of the query string, which is effectively unique and can be calculated consistently on the client side. The client first sends only the hash; if the server doesn't recognize it, the client sends a second request containing the full query along with its hash, so the server can store it for next time.
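The hash-then-register flow can be sketched in a few lines of Python. This is a simplified model rather than Apollo's implementation: the `PersistedQueryStore` class and `client_fetch` function are hypothetical names, and the choice of SHA-256 is an assumption of the sketch.

```python
import hashlib


def query_hash(query: str) -> str:
    """Hex digest of the query string; client and server compute it the same way."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


class PersistedQueryStore:
    """Hypothetical server-side store that maps hashes to query strings."""

    def __init__(self):
        self._queries = {}

    def lookup(self, digest: str):
        return self._queries.get(digest)

    def register(self, digest: str, query: str) -> None:
        # Refuse to store a query under a hash that does not belong to it.
        if query_hash(query) != digest:
            raise ValueError("hash does not match query")
        self._queries[digest] = query


def client_fetch(store: PersistedQueryStore, query: str) -> int:
    """Simulate the client flow and return the number of round trips it needed."""
    digest = query_hash(query)
    if store.lookup(digest) is not None:
        return 1  # the hash alone was enough
    store.register(digest, query)  # second request carries the full query and its hash
    return 2
```

Only the first client to send a given query pays for the extra round trip; every later request for the same query needs just the hash.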

Client-side caching

Enable the consumers of your GraphQL API to do client-side caching by making its query responses cacheable. You can do this by adding Cache-Control headers to your API's responses, which tell the client whether to cache the response and for how long. You'll need a short expiration time for fast-changing data, and a longer time for static data.

You shouldn't change the Cache-Control headers regularly unless you really need to, as this invalidates previously cached versions, leading to more cache misses.
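One way to keep these headers consistent is to derive them from a small table of data categories. The sketch below assumes three hypothetical categories and made-up lifetimes; the right values depend on how often your own data changes.

```python
# Hypothetical lifetimes in seconds for different kinds of data.
MAX_AGE_BY_KIND = {
    "static": 86400,        # e.g. rarely edited reference content
    "slow_changing": 3600,  # e.g. published articles
    "fast_changing": 30,    # e.g. comment counts
}


def cache_control_header(kind: str, private: bool = False) -> str:
    """Build a Cache-Control value; unknown kinds are not cached at all."""
    max_age = MAX_AGE_BY_KIND.get(kind)
    if max_age is None:
        return "no-store"
    scope = "private" if private else "public"
    return f"{scope}, max-age={max_age}"
```

Marking user-specific responses as `private` stops shared caches such as CDNs from storing them, while still letting the user's own browser cache them.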

Server-side caching

You can also cache on the server side by storing GraphQL query responses (in JSON format) in a fast in-memory store like Redis or Memcached, which you then retrieve with a unique key based on the query.

This isn't always necessary if you're going to be doing database caching, but it can be useful for expensive queries, such as those involving more computational effort or third-party API calls.
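A minimal sketch of this pattern is shown below. An in-memory dictionary stands in for Redis or Memcached, and the `ResponseCache` class, the `cached_execute` helper, and the whitespace-based query normalization are assumptions of the example, not part of any library.

```python
import hashlib
import json
import time


class ResponseCache:
    """In-memory stand-in for Redis or Memcached, with a time-to-live per entry."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    @staticmethod
    def key_for(query: str, variables: dict) -> str:
        # Naive normalization: collapse whitespace so formatting does not change the key.
        payload = json.dumps(
            {"query": " ".join(query.split()), "variables": variables}, sort_keys=True
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, key: str):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response_json = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired, so treat it as a miss
            return None
        return response_json

    def set(self, key: str, response: dict) -> None:
        self._entries[key] = (self._clock() + self._ttl, json.dumps(response))


def cached_execute(cache: ResponseCache, execute, query: str, variables: dict) -> dict:
    """Return the cached JSON response if present; otherwise run the query and store it."""
    key = cache.key_for(query, variables)
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    response = execute(query, variables)
    cache.set(key, response)
    return response
```

Including the variables in the key matters: the same query run for two different user IDs must not share a cache entry.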

Database caching

This refers to caching the responses to database queries. These database queries are happening under the hood — GraphQL has resolvers that handle the logic for fetching or computing the data for each field in the schema, and as part of this, they can turn a portion of a GraphQL query into a SQL database query. The database query then returns some raw data as rows, which the resolver then converts into a JSON GraphQL response.

Unlike server-side caching, which caches the JSON response, database caching involves caching the result of the SQL query. However, both types of result can be cached in an in-memory store like Redis. This type of caching is ideal for applications that run the exact same query over and over again.

Allow granular caching

Structuring your GraphQL schema to allow individual fields to be requested independently makes granular caching possible, which allows consumers of your API to avoid over- or under-fetching data. This will require significant planning, as you'll need to work out sensible schema boundaries and add extra modular resolver logic, so it's worth weighing the performance benefits against the extra development effort.

Cache warming

Consumers of your GraphQL API get faster results when the item they need is cached; however, after a restart or a cache eviction, the query needs to be rerun, which can be much slower. Cache warming means pre-populating your cache with commonly requested responses, so that they are "warm" and ready to go again straight after a restart or eviction.
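In its simplest form, warming is a loop that runs at startup. The sketch below assumes the cache is a plain dictionary keyed by query string and that `execute` is whatever function runs a query; both are placeholders for your own cache and executor.

```python
def warm_cache(cache: dict, common_queries, execute) -> int:
    """Pre-populate the cache with common queries; return how many entries were added."""
    added = 0
    for query in common_queries:
        if query not in cache:
            cache[query] = execute(query)
            added += 1
    return added
```

You would typically call this after a deploy or restart, with a list of queries taken from your own traffic statistics.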

2. Efficient schema design

To make your GraphQL API performant, you should start with a well-designed schema — if your schema encourages inefficient queries, caching can only do so much to help. Some tips to improve your schema design include:

Pagination

Adding server-side pagination to your GraphQL API protects it from the performance degradation that comes from requesting hundreds (or thousands!) of content items in a single request. Instead, it allows your consumers to request data in smaller chunks, sending less data over the network per request.

If you have very large datasets (10,000+ records), you'll give your consumers the best performance by offering cursor-based pagination, because the database can seek straight to the cursor position instead of scanning past every skipped row. For smaller datasets, offset pagination is a simpler alternative. It should give good enough performance provided your dataset isn't too large, and it can actually be more intuitive for consumers to use and understand.
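Cursor-based pagination can be sketched as follows. The `POSTS` list, the `post:<id>` cursor format, and the field names in the returned page are assumptions made for the example; a real resolver would run the equivalent query against the database.

```python
import base64
from typing import Optional

# Hypothetical data set, already sorted by ID.
POSTS = [{"id": i, "title": f"Post {i}"} for i in range(1, 8)]


def encode_cursor(post_id: int) -> str:
    """Opaque cursor that hides the underlying ID from consumers."""
    return base64.b64encode(f"post:{post_id}".encode()).decode()


def decode_cursor(cursor: str) -> int:
    return int(base64.b64decode(cursor).decode().split(":", 1)[1])


def posts_page(first: int, after: Optional[str] = None) -> dict:
    """Return up to `first` posts that come after the given cursor."""
    start_id = decode_cursor(after) if after else 0
    # SQL equivalent: WHERE id > :start_id ORDER BY id LIMIT :first + 1
    rows = [p for p in POSTS if p["id"] > start_id][: first + 1]
    page = rows[:first]
    return {
        "items": page,
        "end_cursor": encode_cursor(page[-1]["id"]) if page else None,
        "has_next_page": len(rows) > first,  # the extra row tells us more data exists
    }
```

A consumer calls `posts_page(3)`, then passes the returned `end_cursor` as `after` to get the next three posts, and stops when `has_next_page` is false.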

Batch multiple queries into a single request

In certain cases, you'll actually get better performance by grouping together multiple related requests into one request. This may not seem intuitive when pagination improves performance by doing the opposite of this — splitting one request into multiple requests — so let's dive into why this works.

GraphQL queries can be nested, and what might seem like one request from the consumer's perspective can actually lead to many more internal requests. For example, let's imagine a blog site that lists a number of blog posts on a page. Each blog post object has a title and excerpt, along with information like the most recent comments, who authored the comments, and any reactions (such as 👍, ♥️, 😭, 😡) to the comments. This data could all be retrieved with a single nested query.

Let’s examine how many requests this could end up becoming under the hood. Consider an example of two blog posts, each with three comments, one author per comment, and three different reaction types for the comments (like “love” and “hate”). Because of its nested resolvers, GraphQL would convert this into 15 separate requests.

There would be one request for the blogPosts, two for the comments (one per post), six for the authors (one per comment), and six for the reactions (one per comment). Add those all together and you get 15 requests. Each extra level of nesting multiplies the number of requests. This is known as the “n + 1 problem”: fetching one list of n items leads to n further requests, one per item, for n + 1 in total. As you can imagine, this doesn't scale well: increasing the number of blog posts to 20 increases the number of requests to 141 (1 + 20 + 60 + 60).
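The arithmetic is easy to check with a short Python function. It assumes the same shape as the example above (every post has the same number of comments, and each comment triggers one author request and one reactions request); the function name is made up for this sketch.

```python
def resolver_requests(posts: int, comments_per_post: int) -> int:
    """Count the requests made by unbatched nested resolvers in the blog example."""
    post_list = 1                          # one request for the list of blog posts
    comment_lists = posts                  # one request per post for its comments
    authors = posts * comments_per_post    # one request per comment for its author
    reactions = posts * comments_per_post  # one request per comment for its reactions
    return post_list + comment_lists + authors + reactions


for posts in (2, 10, 20):
    print(posts, resolver_requests(posts, 3))
# prints:
# 2 15
# 10 71
# 20 141
```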

To avoid this performance bottleneck, you can use batching. This groups together similar requests, turning them into a single query at the resolver level. With batching, you'd have one request for the blogPosts, one for the comments, one for the authors, and one for the reactions, making only four requests in total.

The most common way to implement batching in GraphQL is to use DataLoader, a utility that collects GraphQL resolver calls made within the same request and batches them together.
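DataLoader itself is a JavaScript utility, so the Python sketch below only imitates the core idea: collect the keys that individual resolvers ask for, then load them all in one database call. The `AuthorBatcher` class, the `AUTHORS` table, and the explicit `dispatch` step are assumptions of the example and are not DataLoader's API.

```python
# Hypothetical authors table and a log of database round trips.
AUTHORS = {1: "Ada", 2: "Grace", 3: "Linus"}
db_calls = []


def fetch_authors_by_ids(ids) -> dict:
    """One database round trip for many IDs, e.g. WHERE id IN (...)."""
    db_calls.append(sorted(ids))
    return {author_id: AUTHORS[author_id] for author_id in ids}


class AuthorBatcher:
    """Collects requested keys first, then loads them in a single call."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn
        self._pending = set()
        self._loaded = {}

    def queue(self, author_id: int) -> None:
        if author_id not in self._loaded:
            self._pending.add(author_id)

    def dispatch(self) -> None:
        if self._pending:
            self._loaded.update(self._batch_fn(self._pending))
            self._pending = set()

    def get(self, author_id: int) -> str:
        return self._loaded[author_id]


comments = [
    {"text": "Nice post", "author_id": 1},
    {"text": "Agreed", "author_id": 2},
    {"text": "Thanks!", "author_id": 1},
]

batcher = AuthorBatcher(fetch_authors_by_ids)
for comment in comments:
    batcher.queue(comment["author_id"])  # each comment resolver asks for its author
batcher.dispatch()                       # one query covers all of them
names = [batcher.get(comment["author_id"]) for comment in comments]
```

Three comments lead to a single entry in `db_calls`, and the duplicate request for author 1 is collapsed into one key.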

3. Avoid nesting regularly queried fields

We've just seen how heavily nested queries can become very computationally inefficient. Beyond batching, you can avoid this issue at the schema design level by flattening your schema and exposing the nested data as separate top-level queries.

Continuing with the blog site example, imagine a second page on the site that only needs to show a list of blog post titles. The page will need to send a GraphQL query to get the list of blog posts, but if it reuses the nested query from before, the response will also include all the associated comments, reactions, and users, as these are nested below the blog posts.

It's possible to flatten your schema to completely avoid nesting and prevent unnecessary resolvers from running. For example, a schema in which comments, authors, and reactions are nested below each blog post could be completely flattened so that each type is fetched through its own top-level query, or you could choose to flatten it only partially.

Flattening your schema splits your nested queries out into top-level queries, which gives your consumers more control over what and when to load. You'll need to choose which parts of your schema are worth flattening, as, although this gives you more control and better performance, it makes your schema more complex and your API harder to maintain and use.

4. Optimize resolvers

GraphQL resolvers are functions on the server that fetch or compute the data for each field in a query, usually by reading from the database. You can optimize your resolvers by removing any extraneous logic from them and moving this logic into the database query itself.

For example, a resolver might fetch every blog post from the database and then filter the data in the resolver itself.

But you can make your GraphQL query much more efficient by removing the resolver filter and just passing the filtering logic into the database query itself.
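The difference between the two approaches can be sketched with Python's built-in `sqlite3` module. The `blog_posts` table, its `published` column, and both function names are hypothetical, chosen only to make the contrast concrete.

```python
import sqlite3

# Hypothetical table with one draft and two published posts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blog_posts (id INTEGER PRIMARY KEY, title TEXT, published INTEGER)"
)
conn.executemany(
    "INSERT INTO blog_posts (title, published) VALUES (?, ?)",
    [("Draft ideas", 0), ("GraphQL tips", 1), ("Caching 101", 1)],
)


def published_posts_slow() -> list:
    """Fetches every row, then throws away the unwanted ones in the resolver."""
    rows = conn.execute(
        "SELECT id, title, published FROM blog_posts ORDER BY id"
    ).fetchall()
    return [{"id": r[0], "title": r[1]} for r in rows if r[2] == 1]


def published_posts_fast() -> list:
    """Lets the database filter, so only matching rows leave the database."""
    rows = conn.execute(
        "SELECT id, title FROM blog_posts WHERE published = ? ORDER BY id", (1,)
    ).fetchall()
    return [{"id": r[0], "title": r[1]} for r in rows]
```

Both functions return the same posts, but the second transfers fewer rows and lets the database use an index on the filtered column if one exists.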

Limit complexity of queries

You can stop consumers from calling overly complex queries by limiting the depth of queries or using libraries like graphql-query-complexity, but you should always inform them why they're being prevented from doing this.
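A depth limit can be approximated in a few lines. The sketch below is deliberately naive: it counts curly braces in the raw query string, so braces inside string arguments would fool it, and a production server would inspect the parsed query instead. `MAX_DEPTH` and both function names are assumptions of the example.

```python
from typing import Optional

MAX_DEPTH = 4  # hypothetical limit


def query_depth(query: str) -> int:
    """Return the deepest level of selection-set nesting, judged by curly braces."""
    depth = max_depth = 0
    for ch in query:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth


def check_depth(query: str) -> Optional[str]:
    """Return an explanatory error message if the query is too deep, else None."""
    depth = query_depth(query)
    if depth > MAX_DEPTH:
        return (
            f"Query depth {depth} exceeds the maximum of {MAX_DEPTH}; "
            "request fewer nested fields or split the query."
        )
    return None
```

Returning a message that states both the limit and the offending depth tells consumers why the query was rejected and how to fix it.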

It's also worth setting rate limits on a per-entity basis, both to help mitigate DDoS attacks and to prevent consumers from requesting the same data over and over.

5. Scaling

To scale your GraphQL APIs, you can modularize them by breaking your schema into smaller ones, each with their own API. Then you can use schema stitching or GraphQL federation to expose them via a unified API, allowing your consumers to query them as if they're one.

You can also spread the load on your servers by load balancing traffic or by sharding your GraphQL services — spreading your data across different instances, such as by region or user ID range. However, be mindful about doing this when also implementing batch queries, as performance issues can arise if your batch queries span multiple shards.
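To see why batches and shards interact, consider this sketch of sharding by user ID range. The shard names and ranges are invented for the example.

```python
from collections import defaultdict

# Hypothetical shard layout: inclusive user ID ranges mapped to shard names.
SHARD_RANGES = [
    (1, 1000, "shard-a"),
    (1001, 2000, "shard-b"),
    (2001, 3000, "shard-c"),
]


def shard_for(user_id: int) -> str:
    for low, high, shard in SHARD_RANGES:
        if low <= user_id <= high:
            return shard
    raise ValueError(f"no shard configured for user ID {user_id}")


def split_batch_by_shard(user_ids) -> dict:
    """Group a batch of keys so that each shard receives a single request."""
    grouped = defaultdict(list)
    for user_id in user_ids:
        grouped[shard_for(user_id)].append(user_id)
    return dict(grouped)


batch = [5, 17, 1500, 42, 2999]
per_shard = split_batch_by_shard(batch)
# {'shard-a': [5, 17, 42], 'shard-b': [1500], 'shard-c': [2999]}
```

A batch that would have been one query on a single database becomes three here, one per shard, and the response is only as fast as the slowest shard.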

6. Database design

Designing the right database is key to improving your GraphQL performance. Start by choosing between SQL or NoSQL. SQL tends to be best for relational integrity and complex joins; however, you will need to optimize your joins to reduce query time and avoid over-fetching. By contrast, NoSQL is more flexible and is great if you need very high throughput.

Once you've chosen and created your database, set up indexing on the fields that are queried or filtered the most, to speed up how quickly your most common queries can be returned.
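You can check whether an index is actually being used by asking the database for its query plan. The sketch below uses Python's built-in `sqlite3` module with a hypothetical `comments` table; other databases have their own `EXPLAIN` syntax and output.

```python
import sqlite3

# Hypothetical comments table with 1,000 rows spread over 50 posts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO comments (post_id, body) VALUES (?, ?)",
    [(i % 50, f"comment {i}") for i in range(1000)],
)


def plan(sql: str, params=()) -> str:
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # the last column holds the detail text


QUERY = "SELECT body FROM comments WHERE post_id = ?"

before = plan(QUERY, (7,))  # without an index, SQLite reports a full table scan
conn.execute("CREATE INDEX idx_comments_post_id ON comments (post_id)")
after = plan(QUERY, (7,))   # with the index, the plan names idx_comments_post_id
```

The exact wording of the plan varies between SQLite versions, so look for the switch from a scan to a search using the index rather than for a specific string.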

GraphQL performance testing

Testing the performance of your GraphQL API will help you identify bottlenecks, maintain good performance, and ensure that the user experience continues to be great as your system scales. There are three main ways to do this:

Load testing: Simulate high volumes of traffic or large numbers of concurrent users to see how your system performs under high load, using tools like k6, Artillery, or Locust. This will help you discover issues like server timeouts, the system running out of memory, or resolver performance worsening under pressure.
Benchmarking: Test your most important or complex queries to find out how fast each is performing. You can then use these metrics as a baseline against which you try to improve.
Live monitoring and tracing: Monitor your production environment with tools like OpenTelemetry, Apollo Studio, or Datadog. You can use these to trace your queries and see how they're actually performing in production.
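For benchmarking, even a small timing harness gives you a baseline to compare against. In the sketch below, `fake_query` is a placeholder for a function that sends one of your important queries to the API, and reporting the median and 95th percentile is a choice made for this example.

```python
import statistics
import time


def benchmark(run_query, repeats: int = 20) -> dict:
    """Time repeated runs of a query and summarize the results in milliseconds."""
    timings_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings_ms.append((time.perf_counter() - start) * 1000)
    timings_ms.sort()
    p95_rank = (95 * len(timings_ms) + 99) // 100  # nearest-rank percentile, 1-based
    return {
        "runs": len(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "p95_ms": timings_ms[p95_rank - 1],
    }


def fake_query() -> None:
    """Placeholder workload; replace with a real call to your GraphQL endpoint."""
    sum(range(10_000))


baseline = benchmark(fake_query)
```

Storing the baseline alongside the query lets you rerun the same benchmark after each optimization and see whether the median and tail latency actually improved.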

Using Contentful's GraphQL API for high-performance content delivery

Contentful is a versatile content platform with easy-to-use REST and GraphQL APIs. You can use it to completely customize your content model, and it's optimized for fast content delivery at scale.

Its GraphQL API supports pagination out of the box, as well as other GraphQL performance benefits like the ability to write complex queries with nested fields and fine-grained data fetching.

Meet the authors

David Fateh is a software engineer with a penchant for web development. He helped build the Contentful App Framework and now works with developers that want to take advantage of it.