Imagine you have a GraphQL API, backed by a MongoDB instance. A client issues a simple query like this:

query {
  users {
    name
    tweets {
      message
    }
  }
}

Users and Tweets are stored in separate collections. A typical Apollo Server backend will, at a high level, do the following (in pseudocode):

// 'users' resolver
users = fetchAllUsersFromDatabase()

// 'user' resolver
for user in users {
  
  // 'user.name' resolver
  return user.name

  // 'user.tweets' resolver
  tweets = fetchTweetsFromDatabase(userId: user._id)
  return tweets

}

This is a classic N+1 database query problem. Given 10 users, you’ll make 11 database queries.
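To make the count concrete, here is a minimal sketch that simulates resolver-per-field execution against an in-memory stand-in for the database. The `data` object, `naiveFind`, and `resolveUsersNaively` are hypothetical names invented for this illustration; nothing here is Apollo Server or MongoDB API, and each call to `naiveFind` is simply assumed to represent one database round trip.

```javascript
// Hypothetical in-memory stand-in for the two collections (10 users, 30 tweets).
const data = {
  users: Array.from({ length: 10 }, (_, i) => ({ _id: i, name: `user${i}` })),
  tweets: Array.from({ length: 30 }, (_, i) => ({
    _id: i,
    userId: i % 10,
    message: `tweet ${i}`,
  })),
};

let naiveQueryCount = 0;

// Every call is counted as one database round trip.
function naiveFind(collection, predicate) {
  naiveQueryCount += 1;
  return data[collection].filter(predicate);
}

// Resolver-per-field execution: one query for the list, then one per user.
function resolveUsersNaively() {
  const users = naiveFind("users", () => true);
  return users.map((user) => ({
    name: user.name,
    tweets: naiveFind("tweets", (t) => t.userId === user._id).map((t) => ({
      message: t.message,
    })),
  }));
}

const naiveResult = resolveUsersNaively();
console.log(naiveQueryCount); // 11: 1 for users + 10 for tweets
```

The count depends only on the number of users returned, not on how many tweets exist.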

If we knew ahead of time which users we needed to fetch tweets for, we could easily do this in two queries plus a server-side map+filter:

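// mongosh-style; with the Node.js driver, await each toArray() call.
const users = db.users.find({}).toArray();
// Matching against a list of ids requires $in.
const tweets = db.tweets.find({ userId: { $in: users.map(u => u._id) } }).toArray();
return _.map(users, (user) => ({
  name: user.name,
  // ObjectIds are objects, so compare by value with equals() rather than ===.
  tweets: _.filter(tweets, (tweet) => tweet.userId.equals(user._id)),
}));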

Moreover, if we also knew the fields ahead of time, we could save some bytes by projecting only the needed fields in the queries themselves:

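// mongosh-style; the second argument to find() is the projection.
const users = db.users.find({}, { _id: 1, name: 1 }).toArray();
// Matching against a list of ids requires $in.
const tweets = db.tweets.find({ userId: { $in: users.map(u => u._id) } }, { message: 1, userId: 1 }).toArray();
return _.map(users, (user) => ({
  name: user.name,
  // ObjectIds are objects, so compare by value with equals() rather than ===.
  tweets: _.filter(tweets, (tweet) => tweet.userId.equals(user._id)),
}));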
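For comparison, the batched approach can be sketched end to end against an in-memory stand-in. Everything named here (`store`, `batchedFind`, `resolveUsersBatched`) is hypothetical and exists only for illustration; the `fields` argument merely mimics a projection, and the `Set` lookup plays the role that `$in` would play in a real query. The sketch also groups tweets by `userId` once instead of filtering the whole list per user.

```javascript
// Hypothetical in-memory stand-in for the two collections.
const store = {
  users: [
    { _id: 1, name: "ada", email: "ada@example.com" },
    { _id: 2, name: "bob", email: "bob@example.com" },
  ],
  tweets: [
    { _id: 10, userId: 1, message: "hello", createdAt: 1 },
    { _id: 11, userId: 2, message: "hi", createdAt: 2 },
    { _id: 12, userId: 1, message: "again", createdAt: 3 },
  ],
};

let batchedQueryCount = 0;

// One call = one round trip. `fields` mimics a projection.
function batchedFind(collection, predicate, fields) {
  batchedQueryCount += 1;
  return store[collection]
    .filter(predicate)
    .map((doc) => Object.fromEntries(fields.map((f) => [f, doc[f]])));
}

function resolveUsersBatched() {
  const users = batchedFind("users", () => true, ["_id", "name"]);
  const ids = new Set(users.map((u) => u._id));
  // Equivalent in spirit to { userId: { $in: [...] } }.
  const tweets = batchedFind("tweets", (t) => ids.has(t.userId), ["userId", "message"]);

  // Group once by userId so the join is linear, not one filter pass per user.
  const tweetsByUser = new Map();
  for (const tweet of tweets) {
    if (!tweetsByUser.has(tweet.userId)) tweetsByUser.set(tweet.userId, []);
    tweetsByUser.get(tweet.userId).push({ message: tweet.message });
  }

  return users.map((user) => ({
    name: user.name,
    tweets: tweetsByUser.get(user._id) ?? [],
  }));
}

const batchedResult = resolveUsersBatched();
console.log(batchedQueryCount); // 2, regardless of the number of users
console.log(JSON.stringify(batchedResult));
```

The query count stays at two however many users come back; only the size of the id list grows.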

This problem compounds with every additional nested edge your types implement on the graph. Imagine Tweets has an edge to Likes, which has an edge to the Users who liked the Tweet, all stored in separate collections or tables. The number of queries grows exponentially with nesting depth.

1 Users query --> A users
  A Tweets queries --> A*B tweets
    A*B Likes queries --> A*B*C likes
      A*B*C Users queries --> A*B*C users

That is 1 + A + A*B + A*B*C queries (for A users, B tweets per user, and C likes per tweet) to do something that could be trivially solved with 4 queries and some collection manipulation on the results. With more effort, you could even do it in a single query.
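The arithmetic is easy to check with a small helper. This is only a sketch under stated assumptions: `countQueries` is a hypothetical function, every parent at a level is assumed to have the same number of children, and the naive total counts every level, including the intermediate ones. The fan-outs `[B, C, 1]` correspond to tweets per user, likes per tweet, and one user per like.

```javascript
// rootCount: number of rows returned by the root list query (A).
// fanOuts[i]: assumed number of children per parent at nesting level i.
function countQueries(rootCount, fanOuts) {
  let naive = 1; // the root list query
  let parents = rootCount;
  for (const fanOut of fanOuts) {
    naive += parents; // one query per parent at this level
    parents *= fanOut; // the children become the next level's parents
  }
  // Batched: one query for the root plus one per nested level.
  return { naive, batched: 1 + fanOuts.length };
}

// A = 10 users, B = 20 tweets per user, C = 5 likes per tweet, 1 user per like.
console.log(countQueries(10, [20, 5, 1])); // { naive: 1211, batched: 4 }
```

Even with these modest numbers, the naive strategy issues 1 + 10 + 200 + 1000 = 1211 queries where a level-by-level batch needs 4.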

Because GraphQL allows users to sculpt queries however they wish, this isn’t just a theoretical problem; it’s a real security and operations issue that affects every GraphQL API. The common “solution” is to deploy query depth or complexity limiting. This destroys the power of GraphQL, adds unpredictable complexity to the usability of your API, and doesn’t even consistently solve the problem.

Are there any server-side GraphQL frameworks or libraries which make solving this easy? Or even possible? I haven’t seen any. This is either a fundamental problem with GraphQL or an opportunity to improve the server-side GraphQL landscape.