What is GraphQL?

GraphQL has rapidly become an important tool in the toolsets of full stack, frontend, and backend developers. Companies adopting GraphQL praise its efficiency and ease of use, while individual developers choose it to increase their productivity.

This course will teach us how to build a whole application from scratch using GraphQL. We’ll explore how to implement a backend with a database, a React web frontend, authentication, and reactive events, among other things.

We’ll dive into the practical and technical details soon, but first, we just need to lay a little bit of theoretical groundwork.

REST API

Before diving into what GraphQL is and how to use it, we need to understand what our main alternatives are and why it makes sense to explore other options.

Prior to GraphQL, REST was one of the most common software architectures used to implement APIs. In a nutshell, REST is a set of conventions used to organize an application’s API. It’s based on two main concepts: resources and methods.

A resource is any entity within our system, such as a blog post or a user’s comment. Each resource is represented by a resource identifier, such as books/12 for a book with an ID of 12.

We can interact with a resource by using one of the available methods. The most commonly used methods are:

  • GET: To fetch data.
  • POST: To create a new resource.
  • PUT: To update an existing resource.
  • DELETE: To delete a resource.

Let’s imagine that we’re working on a blog application similar to the online publishing site Medium. In this application, we would have resources such as users, blog posts, and comments. Our application should be able to create, read, and update these resources.

To represent a single user, we might have a resource called /users/1 that represents a user with ID 1. We could then send HTTP requests like these:

## Fetch user with ID 1
GET https://blog.com/users/1
## Update user with ID 1
PUT https://blog.com/users/1

We would similarly have resources for blog posts.

## Get all blog posts
GET https://blog.com/posts
## Get a blog post with ID 1
GET https://blog.com/posts/1
## Get comments for the blog post with ID 4
GET https://blog.com/posts/4/comments

REST APIs have been used widely in the industry. It’s common for applications to expose data and operations using REST APIs for frontend applications and third-party clients.
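
To make the idea of resources and methods concrete, here is a minimal Python sketch of how a server might map a method and a resource path to an action. It is an in-memory stand-in with no real HTTP involved, and the `handle` function and the status codes it returns are illustrative assumptions rather than part of any particular framework.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical in-memory data; a real service would use a database.
USERS: Dict[int, Dict[str, Any]] = {1: {"id": 1, "name": "joe"}}


def handle(method: str, path: str, body: Optional[Dict[str, Any]] = None) -> Tuple[int, Any]:
    """Return (status_code, payload) for requests to /users and /users/<id>."""
    parts = path.strip("/").split("/")
    if parts[0] != "users" or len(parts) > 2:
        return 404, None
    if len(parts) == 1:
        # The collection resource: /users
        if method == "GET":
            return 200, list(USERS.values())
        if method == "POST":
            new_id = max(USERS, default=0) + 1
            USERS[new_id] = {**(body or {}), "id": new_id}
            return 201, USERS[new_id]
        return 405, None
    # A single resource: /users/<id>
    if not parts[1].isdigit() or int(parts[1]) not in USERS:
        return 404, None
    user_id = int(parts[1])
    if method == "GET":
        return 200, USERS[user_id]
    if method == "PUT":
        USERS[user_id] = {**(body or {}), "id": user_id}
        return 200, USERS[user_id]
    if method == "DELETE":
        del USERS[user_id]
        return 204, None
    return 405, None


print(handle("GET", "/users/1"))  # prints (200, {'id': 1, 'name': 'joe'})
```

The same path can mean different things depending on the method, which is why `GET /users/1` and `PUT /users/1` end up in different branches.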

Issues with REST API

Despite its wide adoption, REST presents some significant challenges. To demonstrate, let’s look at how we would build a specific feature for our blogging application. We need to implement a webpage for viewing a single blog post, which should include the blog post’s text, comment section, and additional information like the authors’ names.

What would we need in order to build it? We need to get data about the blog post itself, information about the blog post’s authors, and a list of comments.

First, we would send a request to get data about the blog post we need to display.

GET /posts/123
{
  "id": 123,
  "name": "REST vs GraphQL",
  "body": "...",
  "authorIds": [1, 2]
}

This is a good start, but as we can see, we only received the IDs of the two authors of the blog post. User information is exposed through a separate REST resource, so we need to make two more requests.

GET /users/1
{
  "id": 1,
  "name": "joe"
}
GET /users/2
{
  "id": 2,
  "name": "peter42"
}

Now we need to get a list of comments, which requires another request.

GET /posts/123/comments
[
  {
    "id": 1,
    "authorId": 3,
    "text": "Good post!"
  },
  {
    "id": 2,
    "authorId": 4,
    "text": "Could elaborate on..."
  }
]

This time, we got the data about the comments, but we still need to fetch information about the users who wrote them. We need to make as many additional requests as there are unique users in the comments section.

GET /users/3
GET /users/4
...

In this case, we have to make six requests to display a single page: one for the post, two for its authors, one for the comments, and two for the comment authors. One obvious issue with this example is the number of requests a client has to make. A user has to wait while the application fetches each object from the server, which is especially noticeable on a slow network connection.

This issue is known as under-fetching because each request returns less data than a client needs, so our application needs to send more requests to get additional data.
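
The request sequence above can be sketched in a few lines of Python. This is only an illustration: the `get` function returns canned responses instead of sending real HTTP requests, the helper names are hypothetical, and the name for user 4 is a placeholder because it isn’t shown in the example.

```python
from typing import Any, Dict, List

# Canned responses standing in for the REST API above; no real HTTP is involved.
# The name for user 4 is a placeholder because the lesson never shows it.
RESPONSES: Dict[str, Any] = {
    "/posts/123": {"id": 123, "name": "REST vs GraphQL", "body": "...", "authorIds": [1, 2]},
    "/posts/123/comments": [
        {"id": 1, "authorId": 3, "text": "Good post!"},
        {"id": 2, "authorId": 4, "text": "Could elaborate on..."},
    ],
    "/users/1": {"id": 1, "name": "joe"},
    "/users/2": {"id": 2, "name": "peter42"},
    "/users/3": {"id": 3, "name": "alan"},
    "/users/4": {"id": 4, "name": "placeholder-user"},
}

request_log: List[str] = []


def get(path: str) -> Any:
    """Pretend to send GET <path> and record that a round trip happened."""
    request_log.append(path)
    return RESPONSES[path]


def load_post_page(post_id: int) -> Dict[str, Any]:
    """Collect everything the blog post page needs, one REST call at a time."""
    post = get(f"/posts/{post_id}")
    authors = [get(f"/users/{uid}") for uid in post["authorIds"]]
    comments = get(f"/posts/{post_id}/comments")
    # One extra request per unique comment author.
    commenter_ids = sorted({c["authorId"] for c in comments})
    commenters = {uid: get(f"/users/{uid}") for uid in commenter_ids}
    return {"post": post, "authors": authors, "comments": comments, "commenters": commenters}


page = load_post_page(123)
print(len(request_log))  # prints 6
```

Note that the requests for authors and commenters can’t even start until the responses containing their IDs have arrived, so some of the waiting is unavoidably sequential.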

The second issue is the opposite: we might not need all the data returned by the API. A modern application is developed for multiple devices, ranging from smartwatches to desktop applications and smart TVs. Each device has a different screen size and input method (finger, stylus, mouse, and so on) and needs to display different data. For example, on the desktop, we have the luxury of a bigger screen. We might want to display more information about each author, including their short bio and other posts they’ve authored, while on mobile, we only want to display each of the authors’ names.

If our REST API provides a single generic endpoint that returns all available information about each resource, we could be returning significantly more data than a particular client needs. This second issue is called over-fetching because a client is receiving more information than it’s actually going to use.
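
As a rough illustration of over-fetching, the Python sketch below compares what a generic endpoint returns with what a mobile client actually displays. The `bio` and `postIds` fields and their values are assumptions standing in for the extra author details mentioned above, and the `unused_fields` helper is hypothetical.

```python
from typing import Any, Dict, List, Set

# Hypothetical "full" user payload from a generic endpoint such as GET /users/1.
# The bio and postIds fields are illustrative assumptions, not part of the lesson's API.
FULL_USER: Dict[str, Any] = {
    "id": 1,
    "name": "joe",
    "bio": "...",
    "postIds": [123],
}


def unused_fields(payload: Dict[str, Any], needed: Set[str]) -> List[str]:
    """List the fields the client received but will never display."""
    return sorted(set(payload) - needed)


# A mobile client that only shows the author's name discards everything else.
print(unused_fields(FULL_USER, {"name"}))  # prints ['bio', 'id', 'postIds']
```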

Our example is relatively simple. For more complex pages, this process can get even more tedious and could require even more API calls.

Common solutions

These problems aren’t new, and developers have known about them for a long time. One common solution is the backend for frontend (BFF) pattern, where instead of developing generic REST APIs, we introduce APIs for specific UI features. For example, we could have an endpoint that returns all the information that’s needed to display a single blog post.

GET /post_page_all_info/123
{
  "id": 123,
  "name": "REST vs GraphQL",
  "body": "...",
  "authors": [
    {
      "id": 1,
      "name": "joe"
    },
    {
      "id": 2,
      "name": "peter42"
    }
  ],
  "comments": [
    {
      "id": 1,
      "author": {
        "id": 3,
        "name": "alan"
      },
      "text": "Good post!"
    },
    ...
  ]
}

This allows us to get all the data with a single request, but if we need to implement a different webpage that requires a different subset of the same data, we have to create a new endpoint.
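
On the server, such an endpoint could be sketched roughly as follows. This is a minimal Python illustration with in-memory dictionaries in place of a database; the `post_page_all_info` function mirrors the endpoint name above, but the data layout is an assumption.

```python
from typing import Any, Dict

# Hypothetical in-memory tables; a real BFF would query a database or other services.
USERS: Dict[int, Dict[str, Any]] = {
    1: {"id": 1, "name": "joe"},
    2: {"id": 2, "name": "peter42"},
    3: {"id": 3, "name": "alan"},
}
POSTS: Dict[int, Dict[str, Any]] = {
    123: {"id": 123, "name": "REST vs GraphQL", "body": "...", "authorIds": [1, 2]},
}
COMMENTS: Dict[int, Any] = {
    123: [{"id": 1, "authorId": 3, "text": "Good post!"}],
}


def post_page_all_info(post_id: int) -> Dict[str, Any]:
    """Assemble everything the blog post page needs into a single response."""
    post = POSTS[post_id]
    return {
        "id": post["id"],
        "name": post["name"],
        "body": post["body"],
        # IDs are resolved on the server, so the client needs no follow-up requests.
        "authors": [USERS[uid] for uid in post["authorIds"]],
        "comments": [
            {"id": c["id"], "author": USERS[c["authorId"]], "text": c["text"]}
            for c in COMMENTS[post_id]
        ],
    }


result = post_page_all_info(123)
print(result["comments"][0]["author"]["name"])  # prints alan
```

The shape of the response is hard-coded in the function body, which is why a page that needs a different subset of this data needs its own function and endpoint.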

The other problem is that we still suffer from over-fetching if we have a single endpoint per use case. To handle this issue, we would have to introduce even more specialized endpoints, returning a different subset of the same information specifically for mobile, tablets, laptops, and any other potential user device. That might look something like this:

## Get a minimal amount of data for a mobile application
GET /post_page_all_info_for_mobile/1
## Get a little bit more data for a mobile with a bigger screen
GET /post_page_all_info_for_big_mobile/1
## Get more data to display it on a tablet
GET /post_page_all_info_for_tablet/1

As you can imagine, if we have even a mildly complex application, supporting each of these different endpoints would become very onerous and, from an organizational standpoint, hard to scale. We would need to have a large team just to maintain these APIs by adding new fields and supporting additional use cases.

This situation is exactly what GraphQL can help with.

A different way of building APIs

It’s finally time to answer the question, “What is GraphQL?” And more importantly, how can it help us build better applications?

The first defining aspect of GraphQL is right in its name. Instead of structuring an API around resources and methods, GraphQL encourages us to think about our data as a network of interconnected objects (a graph). Each object is connected to other objects, which are in turn connected to other objects. Each GraphQL request can fetch a subset of this graph. For example, in our blogging application, a post is connected to its authors and its comments, and each comment is connected to the user who wrote it.

Note: There’s a common misconception that using GraphQL requires us to also use a graph database, like Neo4j. This is not necessary because we can use any data source when implementing a GraphQL API.

The way a client queries data using a GraphQL API differs from REST. Instead of exposing multiple endpoints, a GraphQL API typically exposes a single endpoint. A client sends all of its queries to that endpoint.

How does a server know what data to return? Easy! A client tells the server what data it needs by specifying which objects and fields the server needs to return.

GraphQL servers can use any data source, such as a database or a microservice, to process a client’s query.
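
The idea of a client naming the objects and fields it wants can be illustrated with a toy Python sketch. This is not how a real GraphQL server is implemented (real servers parse the query text and validate it against a schema); here the selection is written directly as a nested dictionary, and the `select` function and `EDGES` table are hypothetical names.

```python
from typing import Any, Callable, Dict, List

# Hypothetical in-memory data source; any database or service could sit here instead.
USERS: Dict[int, Dict[str, Any]] = {
    1: {"id": 1, "name": "joe"},
    2: {"id": 2, "name": "peter42"},
}
POSTS: Dict[int, Dict[str, Any]] = {
    123: {"id": 123, "name": "REST vs GraphQL", "body": "...", "authorIds": [1, 2]},
}


def post_authors(post: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Follow the post -> authors edge of the graph."""
    return [USERS[uid] for uid in post["authorIds"]]


# Fields that lead to other objects, mapped to functions that fetch them.
EDGES: Dict[str, Callable[[Dict[str, Any]], List[Dict[str, Any]]]] = {
    "authors": post_authors,
}


def select(obj: Dict[str, Any], selection: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the requested fields; None marks a plain field, a dict a nested selection."""
    result: Dict[str, Any] = {}
    for field, sub in selection.items():
        if sub is None:
            result[field] = obj[field]
        else:
            result[field] = [select(item, sub) for item in EDGES[field](obj)]
    return result


wanted = {"id": None, "name": None, "authors": {"name": None}}
print(select(POSTS[123], wanted))
# prints {'id': 123, 'name': 'REST vs GraphQL', 'authors': [{'name': 'joe'}, {'name': 'peter42'}]}
```

Fields the client doesn’t mention, such as `body`, never appear in the result, and related objects are fetched on the server in the same pass.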

With a GraphQL API, we can accomplish three important tasks:

  • Query data.
  • Update data using mutations.
  • Subscribe to real-time updates from the backend.

We’ll cover all of these later in the course, but first, let’s look at how we can query data from a GraphQL API. For example, if we want to get the name of a single post by ID with a list of its authors, our query might look something like this:

{
  post(id: 123) {
    id
    name
    authors {
      id
      name
    }
  }
}

We’ll go through this in more detail later, but this query isn’t hard to read. A client requests data about a post with ID 123 and specifies that it wants the server to return three fields: id, name, and a list of authors for this blog post. For each author object, the server should return the author’s id and name.
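
A query like this is commonly sent to the server as an HTTP POST request with a JSON body that carries the query text under a `query` key. The Python sketch below builds such a request without sending it. The URL `https://blog.com/graphql` is an assumption for illustration, as is the `build_request` helper.

```python
import json
import urllib.request

# Assumed endpoint URL; the lesson does not say where the GraphQL API would live.
GRAPHQL_URL = "https://blog.com/graphql"

QUERY = """
{
  post(id: 123) {
    id
    name
    authors {
      id
      name
    }
  }
}
"""


def build_request(query: str, url: str = GRAPHQL_URL) -> urllib.request.Request:
    """Wrap a GraphQL query in a JSON POST request aimed at the single endpoint."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(QUERY)
print(request.get_method(), request.full_url)  # prints POST https://blog.com/graphql
```

Passing the request to `urllib.request.urlopen` would send it. A different query would go to the same URL with a different `query` string in the body.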

A response to this query might look like this:

{
  "data": {
    "post": {
      "id": 123,
      "name": "REST vs GraphQL",
      "authors": [
        {
          "id": 1,
          "name": "joe"
        },
        {
          "id": 2,
          "name": "peter42"
        }
      ]
    }
  }
}

Notice how this is different from REST APIs. Instead of providing a fixed set of endpoints to query, GraphQL gives a client the flexibility to construct a query that fetches exactly the data it needs for a specific use case. This lets developers tweak their queries for the particular needs of their applications, and it addresses the over-fetching issues that we saw before.

Instead of backend developers defining the shape of data returned from an API, each client application describes exactly what data it needs in each case.

One important thing to notice is that the GraphQL query defines the shape of the response. A response will only include the fields requested by the client.
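
Because the query defines the shape of the response, a client can check a result against the fields it asked for. The Python sketch below does this with a selection written as a nested dictionary. The `has_shape` helper is hypothetical and only illustrates the idea, since GraphQL client libraries normally take care of this.

```python
from typing import Any, Dict


def has_shape(data: Any, selection: Dict[str, Any]) -> bool:
    """True if `data` has exactly the fields named in `selection`, recursively.

    In `selection`, None marks a plain field and a dict marks a nested selection.
    Lists are checked item by item.
    """
    if isinstance(data, list):
        return all(has_shape(item, selection) for item in data)
    if not isinstance(data, dict) or set(data) != set(selection):
        return False
    return all(
        sub is None or has_shape(data[field], sub)
        for field, sub in selection.items()
    )


# The fields requested for the post, written as a nested dictionary.
selection = {"id": None, "name": None, "authors": {"id": None, "name": None}}
post = {
    "id": 123,
    "name": "REST vs GraphQL",
    "authors": [{"id": 1, "name": "joe"}, {"id": 2, "name": "peter42"}],
}

print(has_shape(post, selection))  # prints True
print(has_shape({**post, "body": "..."}, selection))  # prints False
```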

This is another selling point for GraphQL because a client can be certain that the returned data will have a predictable format. That’s different from a typical REST API, where the server decides the shape of each response and the client has to rely on documentation or convention to know what a result will look like.

More than just a query language

At this point, it might seem like GraphQL’s main benefit is that it allows the implementation of a more efficient API. Although this is an important selling point, this is not the only benefit. The GraphQL community has implemented several tools that make things much easier for frontend developers.

Modern client applications are very complex. They keep their state in sync with the backend, orchestrate API calls, denormalize received data, and much more. Only a fraction of development time is spent building a UI.

This is where GraphQL client libraries shine. They allow applications to fetch data more efficiently and simplify state management and interaction with the API on the client’s side. Later on, we’ll see how just a few lines of GraphQL code can replace swathes of state management code.

A brief history of GraphQL

GraphQL was created in 2012 at Facebook, which, like many other companies, was experiencing issues similar to the ones described above. These issues became more pronounced when Facebook engineers started to ramp up support for mobile devices and began working on adding a news feed.

Facebook already had existing APIs to query news feed data, but its engineers quickly ran into problems using them. A news feed frontend had to fetch many related objects from the server, such as timeline posts, user information, comments, etc. When loading a single page, sending multiple requests from a mobile device resulted in unacceptable latencies, sometimes up to 45 seconds.

Facebook engineers tried various approaches to avoid multiple roundtrips and over-fetching. They wanted to have the ability to reduce the number of requests they were sending and even considered using FQL (Facebook Query Language, a query language similar to SQL) to specify what data the client needed. However, this was hard to support and required developers to write massive queries on the frontend.

Eventually, they came up with the idea of a simpler query language that was flexible enough to represent the variety of queries Facebook had to support. It would not be as complex as SQL and would be easier to maintain. This language eventually became GraphQL.

After using GraphQL for several years, Facebook presented it at a React conference in 2015, and open-sourced implementations of a GraphQL client and a GraphQL server, which kickstarted the adoption of GraphQL by other companies.

Facebook was not the only company trying to fix its API problems. Other companies like Coursera and Netflix were also working on developing more flexible query languages. However, they also eventually migrated to GraphQL, along with a plethora of other companies.

GraphQL adoption

Although GraphQL started as a Facebook project, it’s no longer owned by the company. Facebook wrote a GraphQL spec that allowed the creation of an ecosystem of client libraries, server-side libraries, and other various tools. Since 2018, the project has been hosted by the Linux Foundation.

GraphQL was originally presented at a React conference, but now it can be used with pretty much any web framework (Angular, Vue, and others) or on any platform (Android, iOS, and others). It’s also not limited to JavaScript. There are server and client-side libraries for all major programming languages.

GraphQL has been widely adopted in the industry and is used by companies such as Airbnb, Shopify, Lyft, and Pinterest.

Summary

We’ve covered a lot of ground in this chapter, but we can boil it down to a few main takeaways:

  • REST APIs can suffer from under-fetching or over-fetching problems.
  • Most of the common approaches to solve these issues are hard to implement and scale.
  • GraphQL is an alternative to REST APIs that addresses under-fetching and over-fetching issues.
  • GraphQL allows a client to send custom queries that are fine-tuned for each particular use case.
  • A lot of tools have been developed for GraphQL, including client libraries that make frontend development much easier.