Generate Large dynamic sitemaps in NextJS 🚀

This article shows how to generate large, nested sitemaps in Next.js from an API response, following Google's guidelines.

Sitemaps are the easiest way to communicate with Google. They indicate the URLs that belong to your website and when they update so that Google can easily detect new content and crawl your website more efficiently.

A sitemap is a file where you provide information about the pages, videos, and other files on your site, and the relationships between them. Search engines like Google read this file to crawl your site.

Learn more about sitemaps from Google Search Central.

How to Add Sitemaps to a Next.js Project

There are two options:

  • Manual

    If you have a relatively simple and static site, you can manually create a sitemap.xml in the public directory of your project:

   <?xml version="1.0" encoding="UTF-8"?>
   <!-- public/sitemap.xml -->
   <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
     <url>
       <loc>http://www.example.com/foo</loc>
       <lastmod>2021-06-01</lastmod>
     </url>
   </urlset>
  • Dynamic

    We can create a new page inside the pages directory, such as pages/sitemap.xml.js. This page calls our API to get the data that tells us the URLs of our dynamic pages, then writes an XML document as the response for /sitemap.xml.

    Here is an example you can try out for yourself:

      //pages/sitemap.xml.js
      const EXTERNAL_DATA_URL = 'https://jsonplaceholder.typicode.com/posts';
    
      function generateSiteMap(posts) {
        return `<?xml version="1.0" encoding="UTF-8"?>
         <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
           <!--We manually set the two URLs we know already-->
           <url>
             <loc>https://jsonplaceholder.typicode.com</loc>
           </url>
           <url>
             <loc>https://jsonplaceholder.typicode.com/guide</loc>
           </url>
           ${posts
             .map(({ id }) => {
               return `
             <url>
                 <loc>${`${EXTERNAL_DATA_URL}/${id}`}</loc>
             </url>
           `;
             })
             .join('')}
         </urlset>
       `;
      }
    
      function SiteMap() {
        // getServerSideProps will do the heavy lifting
      }
    
      export async function getServerSideProps({ res }) {
        // We make an API call to gather the URLs for our site
        const request = await fetch(EXTERNAL_DATA_URL);
        const posts = await request.json();
    
        // We generate the XML sitemap with the posts data
        const sitemap = generateSiteMap(posts);
    
        res.setHeader('Content-Type', 'text/xml');
        // we send the XML to the browser
        res.write(sitemap);
        res.end();
    
        return {
          props: {},
        };
      }
    
      export default SiteMap;
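
The example above interpolates values straight into the XML template. That works for numeric ids, but the sitemap protocol expects URLs to be entity-escaped, so a URL containing a character such as & would produce invalid XML. Here is a minimal sketch of an escaping helper; the names escapeXml and urlEntry are hypothetical and not part of Next.js or any library.

```typescript
// Escape the five characters that XML treats as markup.
// "&" must be replaced first so later entities are not double-escaped.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build one <url> entry; `lastmod` is optional.
function urlEntry(loc: string, lastmod?: string): string {
  const lastmodTag = lastmod ? `<lastmod>${escapeXml(lastmod)}</lastmod>` : '';
  return `<url><loc>${escapeXml(loc)}</loc>${lastmodTag}</url>`;
}

console.log(urlEntry('https://example.com/search?q=a&page=2'));
// prints: <url><loc>https://example.com/search?q=a&amp;page=2</loc></url>
```

You could call urlEntry inside the posts.map callback instead of building the <url> string by hand.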
    

Problem:

What if you have around 100,000+ records in your database and you want to create a sitemap for all those records so that Google can crawl all your products/posts or records?

Solution (Large Dynamic Sitemap):

When a sitemap becomes large (Google limits a single sitemap file to 50,000 URLs or 50 MB uncompressed), it is split into multiple sitemap files plus one sitemap index file that points to them. Learn more about splitting sitemaps in Google’s documentation.
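
To make the idea of a sitemap index concrete, here is a rough sketch of the XML such a file contains. The helper name buildSitemapIndex and the example.com URLs are assumptions for illustration, and the sketch assumes the URLs contain no XML special characters.

```typescript
// Build a sitemap index that points at child sitemap files.
// Assumes the URLs contain no XML special characters such as "&".
function buildSitemapIndex(sitemapUrls: string[]): string {
  const entries = sitemapUrls
    .map((url) => `  <sitemap><loc>${url}</loc></sitemap>`)
    .join('\n');
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    '</sitemapindex>',
  ].join('\n');
}

console.log(
  buildSitemapIndex([
    'https://example.com/dynamic-sitemap-0.xml',
    'https://example.com/dynamic-sitemap-1.xml',
  ]),
);
```

Each <sitemap> entry points to a regular sitemap file that uses the <urlset> format shown earlier.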

We can break the sitemap down in the following steps:

  • Generate main Sitemap

  • Link Feature Specific Sitemaps in the main Sitemap e.g. (Products/Services)

    • Feature sitemap API call to calculate the number of nested sitemaps as records/1000, rounded up (this means each nested sitemap of a feature sitemap holds up to 1,000 records; the code later in this article uses 10,000 per file).

    • Render Parent Feature sitemap e.g. (Products/Services) along with nested sitemaps listed in it e.g. (sitemap-product-1, sitemap-product-2 ...)
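
The arithmetic behind the steps above can be sketched as follows. The function names and the sitemap-<feature>-<n>.xml naming are assumptions based on the examples in the list, not a fixed convention.

```typescript
// Number of nested sitemap files needed to hold `totalRecords`.
function sitemapCount(totalRecords: number, perSitemap: number): number {
  return Math.ceil(totalRecords / perSitemap);
}

// Hypothetical naming scheme: sitemap-product-1.xml, sitemap-product-2.xml, ...
function nestedSitemapNames(
  feature: string,
  totalRecords: number,
  perSitemap: number,
): string[] {
  const names: string[] = [];
  const count = sitemapCount(totalRecords, perSitemap);
  for (let i = 1; i <= count; i++) {
    names.push(`sitemap-${feature}-${i}.xml`);
  }
  return names;
}

console.log(sitemapCount(100000, 1000)); // prints: 100
console.log(nestedSitemapNames('product', 2500, 1000));
// three names: sitemap-product-1.xml, sitemap-product-2.xml, sitemap-product-3.xml
```

Rounding up matters: 2,500 records at 1,000 per file need three files, with the last one only half full.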

Note: You’ll need new routes to render the sitemap index and each of the sitemap pages. Sitemaps should be at the root level, with clean URLs like /dynamic-sitemap.xml and /dynamic-sitemap-0.xml, /dynamic-sitemap-1.xml, etc. Since Next.js doesn’t let us do dynamic page names like dynamic-sitemap-[page].ts, we can leverage rewrites.

Create the following pages:

/pages
  /dynamic-sitemap
    /index.ts <-- this corresponds to the sitemap index
    /[page].ts <-- this corresponds to an individual sitemap

Then, add the rewrites in the Next.js config:

// next.config.js

/** @type {import('next').NextConfig} */
const config = {
  ...
  rewrites: async () => [
    {
      source: '/dynamic-sitemap.xml',
      destination: '/dynamic-sitemap',
    },
    {
      source: '/dynamic-sitemap-:page.xml',
      destination: '/dynamic-sitemap/:page',
    },
  ],
};

module.exports = config;
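
To see what these two rules do, here is a rough emulation of the mapping in plain TypeScript. This is not how Next.js matches paths internally; resolveSitemapPath is a hypothetical helper and the regular expression is an assumption that only mirrors the intent of the rules.

```typescript
// Rough emulation of the two rewrite rules, for illustration only.
// Returns the internal route a public sitemap URL would be served from,
// or null when neither rule applies.
function resolveSitemapPath(path: string): string | null {
  if (path === '/dynamic-sitemap.xml') {
    return '/dynamic-sitemap';
  }
  const match = /^\/dynamic-sitemap-([^/]+)\.xml$/.exec(path);
  return match ? `/dynamic-sitemap/${match[1]}` : null;
}

console.log(resolveSitemapPath('/dynamic-sitemap.xml'));   // prints: /dynamic-sitemap
console.log(resolveSitemapPath('/dynamic-sitemap-3.xml')); // prints: /dynamic-sitemap/3
console.log(resolveSitemapPath('/about'));                 // prints: null
```

The value captured as :page is what later arrives in ctx.params.page inside [page].ts.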

The next-sitemap npm package provides two APIs to generate server-side sitemaps:

  • getServerSideSitemapIndex to generate the sitemap index file.

  • getServerSideSitemap to generate a single sitemap file.

For the index file, we just need to work out how many sitemap pages will exist and pass their URLs to getServerSideSitemapIndex.

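// dynamic-sitemap/index.ts
// route rewritten from /dynamic-sitemap.xml
// fetchCountOfDynamicPages and getBaseUrl are app-specific helpers you provide.
import type { GetServerSideProps } from 'next';
// Pages-directory signature (ctx, urls); next-sitemap v4 exports this variant
// as getServerSideSitemapIndexLegacy, so check your installed version.
import { getServerSideSitemapIndex } from 'next-sitemap';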

const URLS_PER_SITEMAP = 10000;

export const getServerSideProps: GetServerSideProps = async ctx => {
  // obtain the count hitting an API endpoint or checking the DB
  const count = await fetchCountOfDynamicPages();
  const amountOfSitemapFiles = Math.ceil(count / URLS_PER_SITEMAP);

  const sitemaps = Array(amountOfSitemapFiles)
    .fill('')
    .map((v, index) => `${getBaseUrl()}/dynamic-sitemap-${index}.xml`);

  return getServerSideSitemapIndex(ctx, sitemaps);
};

// Default export to prevent Next.js errors
export default function MemorialSitemapIndexPage() {}

For the individual sitemaps, we need to fetch the corresponding page of records and pass the URLs to getServerSideSitemap.

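// dynamic-sitemap/[page].ts
// route rewritten from /dynamic-sitemap-[page].xml
// fetchDynamicPagesForSitemap and getSiteUrl are app-specific helpers you provide.
import type { GetServerSideProps } from 'next';
// Pages-directory signature (ctx, fields); next-sitemap v4 exports this variant
// as getServerSideSitemapLegacy, so check your installed version.
import { getServerSideSitemap } from 'next-sitemap';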

const URLS_PER_SITEMAP = 10000;

export const getServerSideProps: GetServerSideProps<
  any,
  { page: string }
> = async ctx => {
  if (!ctx.params?.page || isNaN(Number(ctx.params?.page))) {
    return { notFound: true };
  }
  const page = Number(ctx.params?.page);

  // this would load the items that make dynamic pages
  const response = await fetchDynamicPagesForSitemap({
    page,
    pageSize: URLS_PER_SITEMAP,
  });

  const total = response.data.pageData.total;
  const totalSitemaps = Math.ceil(total / URLS_PER_SITEMAP);

  if (response.data.items.length === 0) {
    return { notFound: true };
  }

  // Map each record to a sitemap entry
  const fields = response.data.items.map(item => ({
    loc: `${getSiteUrl()}/${item.slug}`,
    lastmod: item.created_at,
  }));

  return getServerSideSitemap(ctx, fields);
};

// Default export to prevent next.js errors
export default function MemorialSitemapPage() {}

You now have a nested sitemap that stays manageable and follows Google's recommendations:

  1. Main sitemap

  2. Feature sitemap

  3. Nested sitemap of a feature

And that's it 😊, Thanks.
