Caching API requests in Next.js

Depending upon your headless CMS, you might find yourself making up to three API calls per page. I wondered if there was a way to avoid this.

Let's have a quick recap of how a dynamic page in Next.js works. If you're a Next Expert (Nexpert?), feel free to skip this bit.

Slug traps

In Next.js, you can create a special kind of page using square brackets inside your pages directory, for example [id].js (you can also use square brackets on the name of a folder). This will serve as a template for a dynamic route and will be used if the user types in (for example) I-want-to-break-free after your domain name, assuming that there isn't already a file or folder which matches that string, of course.

This string is sometimes called a slug (Shopify prefers the term handle). You can choose whatever name you like within those square brackets, then use this term as a variable inside your React code.

For example, if you had a file called [slug].js inside the root of your pages directory, it would be used if your user went to www.yoursite.com/my-mother-made-me-wear-these-shorts. Inside [slug].js you could access the string my-mother-made-me-wear-these-shorts inside the getStaticProps() function.

getStaticProps() receives an argument passed to it by Next.js which you can unwrap like this:

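export async function getStaticProps({ params }) {
  const { slug } = params;
  console.log(slug); // "my-mother-made-me-wear-these-shorts"
  // getStaticProps() has to return an object; "props" is handed to the page component
  return { props: { slug } };
}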

What you choose to do with this information is up to you.

Next: anatomy

But I'm getting ahead of myself slightly. Let's quickly run through the anatomy of a React component inside the pages directory:

const Component = (props) => {
  // Code which runs before the component is rendered
  return <p>Page JSX goes here</p>;
};

// Next.js needs the page component to be the default export
export default Component;

export async function getStaticProps({ params }) {
  // Code which runs on Node which ends up populating the "props" argument sent to the component above
  return { props: {} };
}

export async function getStaticPaths() {
  // Code which runs on Node which generates the static pages, when the site is built
  return { paths: [], fallback: false };
}

This is the way these functions are usually laid out, but they run in reverse order: getStaticPaths() runs when the site is first built, getStaticProps() runs next, and finally your component gets a chance to have a go.

The content journey

So how does content make it from your headless CMS to your React component? Here's one example:

First, getStaticPaths() makes an API request to your CMS and asks for a list of all the pages of a particular type, for example blog, product, FAQ or some kind of basic page.

Your CMS will send back a list of page URLs which getStaticPaths() needs to loop through, in order to know which pages it needs to pre-generate on the server.
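
As a rough sketch of this first step, the function below turns a page list into the shape getStaticPaths() has to return. The fetchPageList argument and the { id, url } response shape are assumptions made up for illustration; your CMS will have its own client and field names.

```javascript
// Hypothetical CMS response: [{ id: "12", url: "/about-us" }, ...]
async function buildStaticPaths(fetchPageList) {
  const pages = await fetchPageList(); // one API request for the whole list

  return {
    // Next.js expects one { params } object per page to pre-generate.
    // The key ("slug") has to match the name in the square brackets: [slug].js
    paths: pages.map((page) => ({
      params: { slug: page.url.replace(/^\//, "") },
    })),
    // false means any URL which isn't in the list gets a 404
    fallback: false,
  };
}
```

Inside pages/[slug].js, getStaticPaths() could then return the result of buildStaticPaths(), passing in whatever function talks to your CMS.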

Next, getStaticProps() needs to get the page content for the particular page the user is on. All getStaticProps() knows at this point is the URL of the current page. Some headless CMSs will allow you to request the page content just using that string, but others will require the unique ID which represents this content.

If it's the latter, then getStaticProps() needs to make an API request for a list of all the URLs of the current page type, plus their IDs (you might notice that this list is one data-point removed from the last request).

Once we have this information, we can loop through it to find our current page. And once we have that, we can match it to the ID we need.

Finally, we use the ID to make a third API request to get the page content detail for the current page.
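
Here's a minimal sketch of that lookup, assuming a CMS which only hands over page content by ID. The cms object, its two methods and the { id, url } list entries are hypothetical stand-ins rather than any real CMS client.

```javascript
// Find the ID which belongs to the current slug, or null if there's no match.
// Assumes each list entry looks like { id: "12", url: "/about-us" }.
function findPageId(pageList, slug) {
  const match = pageList.find((page) => page.url === `/${slug}`);
  return match ? match.id : null;
}

// "cms" is a hypothetical client object with two async methods.
async function loadPageProps(slug, cms) {
  const pageList = await cms.fetchPageList(); // API request: URLs plus IDs
  const id = findPageId(pageList, slug);

  if (id === null) {
    return { notFound: true }; // tells Next.js to serve a 404 for this page
  }

  const content = await cms.fetchPageContent(id); // API request: page detail
  return { props: { content } };
}
```

The object returned by loadPageProps() is in the shape getStaticProps() returns, so getStaticProps() could call it with params.slug and pass the result straight back.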

That seems ... a bit much

Three API calls per page? Well, not quite. getStaticPaths() only runs when the site is building (which can be configured to happen on a kind of repeat, after a pause). But this was my feeling too - how can we minimise these calls?

Two calls in one

The calls made by getStaticPaths() and getStaticProps() are nearly the same - both get a list of all pages, one with IDs, the other with URLs. Why can't we share this data?

Full stack

Next.js is a full-stack framework. That means that some of its code runs on the Node server.

Server-side code and client-side code can both sit inside the same file, but Next.js understands that anything within getStaticPaths() and getStaticProps() needs to run only on the server.

This is great! It means we can use secret API keys and passwords without worrying about exposing these credentials to the public. However, it also means we can't share information between these two functions via React state or variables, even if they share a file.

It's also worth noting that if getStaticProps() is running on every page, then it's likely that it's making the exact same API request every time, for a set of content which rarely changes. Is there a way to reinvent the cache?

Reinventing the cache

One of the benefits of running server-side code is that we web developers have finally been let loose on the file system. Using fs from Node, we can run riot there and even change existing files. Amazing. We could create a JSON file which contains the result of this API call, plus a timestamp, then pull the data from that instead of the API.
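
To make that idea concrete, here's a small sketch of writing and reading such a file with Node's fs module. The savedAt and data field names are my own assumptions for illustration, and aren't necessarily what the proof-of-concept uses.

```javascript
const fs = require("node:fs");

// Save the API result alongside the time it was fetched
function writeCache(cachePath, data) {
  const payload = { savedAt: Date.now(), data };
  fs.writeFileSync(cachePath, JSON.stringify(payload), "utf8");
}

// Returns { savedAt, data }, or null if the file is missing or isn't valid JSON
function readCache(cachePath) {
  try {
    return JSON.parse(fs.readFileSync(cachePath, "utf8"));
  } catch {
    return null;
  }
}
```

Because this relies on fs, it can only be called from code which runs on the server, such as getStaticProps() and getStaticPaths().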

I've created a proof-of-concept for this idea, which you can try out or browse on GitHub.

The world's worst CMS

In order to get this proof-of-concept working, I needed to put together a very simple CMS with an API I could pull data from. Luckily, Next.js has all of the features needed to achieve this. Next allows us to create our own APIs which run on the Node server. The data for the site is stored in a JSON file which can be interrogated via two APIs:

  • Get page list, which outputs a list of all the page names, IDs and URLs
  • Get page data which takes an ID argument and supplies the JSON for one particular page
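
A sketch of what those two endpoints might look like as Next.js API route handlers follows. The file names, the sample data and the handler names are invented for illustration; in a real project each handler would be the default export of its own file inside pages/api.

```javascript
// Stand-in for the contents of the JSON file (hypothetical data)
const pages = [
  { id: "1", name: "Home", url: "/", body: "Welcome" },
  { id: "2", name: "About", url: "/about", body: "About us" },
];

// e.g. pages/api/page-list.js (in Next.js this would be the default export)
function pageListHandler(req, res) {
  const list = pages.map(({ id, name, url }) => ({ id, name, url }));
  res.status(200).json(list);
}

// e.g. pages/api/page-data.js, called as /api/page-data?id=2
function pageDataHandler(req, res) {
  const page = pages.find((entry) => entry.id === req.query.id);
  if (!page) {
    res.status(404).json({ error: "Page not found" });
    return;
  }
  res.status(200).json(page);
}
```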

I need to pause here, to remind you what a daft approach this is. It's essentially doing this:

[Image: a staircase representing wasted effort. Beautiful Steps #2, 2009, Sabina Lang and Daniel Baumann]

You shouldn't take this approach in your own projects because you already have access to the file system. There's no need to create an API to grant you access to something you can just pull from the file system yourself.

Setting a global variable

I wanted to set the time before the cache is refreshed in one global place, so that changing it once would update everywhere it's used. I've used the .env file for this. You can access variables stored there using process.env. followed by the name of the variable in that file (which will be in capital letters, to emphasise that it's a CONSTANT).
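
As an illustration, here's one way the cache lifetime could be read from the environment. The variable name CACHE_MAX_AGE_SECONDS and the one-hour default are assumptions of mine, not values from the proof-of-concept.

```javascript
// .env (hypothetical variable name):
// CACHE_MAX_AGE_SECONDS=3600

// Environment variables always arrive as strings, so convert before using them
function getCacheMaxAgeMs(env = process.env) {
  const seconds = Number(env.CACHE_MAX_AGE_SECONDS);
  // Fall back to one hour if the variable is missing or isn't a positive number
  const safeSeconds = Number.isFinite(seconds) && seconds > 0 ? seconds : 3600;
  return safeSeconds * 1000;
}
```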

Updating the cache

The code in [slug].js first loads the cache from disk. Then getStaticPaths() checks the date of the cache against the current date and decides if it can be used or not. If it's too old, it attempts to get a new version from the API.
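
The freshness check itself can be very small. This sketch assumes the cache is an object with a savedAt timestamp in milliseconds (a made-up field name) and that the maximum age is supplied by the caller.

```javascript
// Returns true only if a cache exists and is younger than maxAgeMs.
// "now" can be passed in, which makes the function easy to test.
function isCacheFresh(cache, maxAgeMs, now = Date.now()) {
  if (!cache || typeof cache.savedAt !== "number") {
    return false; // no usable cache counts as stale
  }
  return now - cache.savedAt < maxAgeMs;
}
```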

A flaw in the plan

The problem with this is that when getStaticPaths() runs, the site is still being built, so the Node server which hosts my API isn't running yet. The request is always going to fail. This, however, is purely a result of how I've built this proof-of-concept. In the real world, your headless CMS would have a much better up-time.

This flaw does mean that the code has a fallback: if getStaticPaths() can't reach the API to get a new copy of the JSON, it uses the old one (but makes sure to not update the date on that file).
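
That fallback could be sketched like this. Everything here is illustrative: fetchFresh and saveCache are hypothetical functions supplied by the caller, and the cache is assumed to be either null or an object shaped like { savedAt, data }.

```javascript
async function loadWithFallback({ cache, maxAgeMs, fetchFresh, saveCache, now = Date.now() }) {
  const isFresh = Boolean(cache) && now - cache.savedAt < maxAgeMs;
  if (isFresh) {
    return cache.data; // no API request needed
  }

  try {
    const data = await fetchFresh();
    saveCache({ savedAt: now, data }); // only a successful fetch gets a new timestamp
    return data;
  } catch (error) {
    if (!cache) {
      throw error; // nothing to fall back on
    }
    return cache.data; // stale copy; its timestamp is left untouched
  }
}
```

Leaving the old timestamp alone matters: the cache still counts as stale, so the next run will try the API again rather than trusting out-of-date data for another full cycle.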

Props to props

Next, getStaticProps() does the same thing and has the same issue as getStaticPaths(), so it reuses the same cached file. When it attempts to get page-specific data, it also fails, which means that placeholder copy ends up on every page. However, as soon as the site hydrates, this is replaced by the correct content.

Finally, the Node server starts correctly and the Page component runs, pulling in the correct copy. When the API is reached correctly for the first time, the cache is updated with any changes to the data, plus a timestamp.

How else could this be used?

It's tempting to write code which would download everything from a headless service, both to ensure uninterrupted service and to remain within API call limits.

Unfortunately, for a service such as Shopify, limits are not calculated based on the number of API calls, but on their complexity.

Unless code ticked away in the background updating the cache slowly and incrementally, this would quickly hit the limit.

Shopify does offer "bulk operations", but these seem to work quite differently from normal API calls - it doesn't sound like they could be generated on-demand, for example.

Where this code could be useful is where an API with a strict limit is called too many times, and the data changes rarely.