An XML sitemap is, quite literally, a roadmap of your website for search engines.
It helps them navigate all the pages, images, and videos on your site that you want to be indexed – ideally as quickly as possible. When you submit your sitemap correctly, it speeds up the crawling process. This way, Google, Bing, and other search engines can discover your content faster.
In this article, we’ll go over how sitemaps work, why they’re essential for SEO, and how you can set yours up.
What Is a Sitemap?
A sitemap is a file that lists all the important content on your website that you want search engines to find. It plays a key role in your SEO toolkit by helping Google and other search engines know which pages to prioritize when crawling your site. This leads to faster indexing and improved visibility on search engine results pages (SERPs).
While XML sitemaps are the most commonly used, there are other types worth mentioning, like RSS feeds, news sitemaps, and media sitemaps for videos and images. These serve specific content types and can be quite useful for crawling different formats.
However, for SEO purposes, the XML sitemap is usually the most important because it directly impacts how well search engines index your web pages.
How Do Sitemaps Help SEO?
Search engines like Google and Bing strive to crawl and index your site as quickly as possible, but they can only discover pages that are linked to from your site or other websites. This process can be time-consuming, especially for larger and media-rich sites.
By submitting a sitemap, you essentially give search engines a direct list of pages to explore, making the entire crawling process smoother. This is especially helpful for large websites or new sites that may not have many backlinks yet.
Note that submitting your sitemap is a great first step, but it’s equally important to keep it updated. As you add new content or make changes to your site, regularly updating your sitemap gives search engines an accurate, current map of your site.
What Does an XML Sitemap Look Like?
An XML sitemap is a list of URLs that looks something like this:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://www.yoursite.com/</loc>
<lastmod>2024-10-20</lastmod>
</url>
<url>
<loc>https://www.yoursite.com/about</loc>
<lastmod>2024-08-15</lastmod>
</url>
</urlset>
It might look like just a bunch of random code for the non-tech-savvy, but each <url> entry here has two main components:
- loc: This represents the URL of the page (not of the sitemap file itself). It directs search engines to the specific content you want them to index.
- lastmod: This represents the last time the page was modified (this is optional, but can help search engines know when to recrawl a page). According to its documentation, Google uses the <lastmod> value if it's “consistently and verifiably (for example, by comparing to the last modification of the page) accurate.”
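If you'd rather generate this structure than write it by hand, here is a minimal sketch in Python using only the standard library. The function name `build_sitemap` and the sample pages are our own illustration, not part of any particular CMS or plugin.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (as a string) for a list of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        if last_modified is not None:  # <lastmod> is optional
            ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; the second one has no known modification date.
sitemap_xml = build_sitemap([
    ("https://www.yoursite.com/", date(2024, 10, 20)),
    ("https://www.yoursite.com/about", None),
])
print(sitemap_xml)
```

Letting the XML library write the file also takes care of escaping characters such as `&` in URLs, which is an easy thing to get wrong by hand.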
Now, onto the most pressing question…
Is a Sitemap Necessary?
Technically, no – you can rank on search engines without a sitemap, especially if your internal linking is flawless and your website is relatively “small” (with 100 pages or under). But why take the chance? Sitemaps make sure that all your important pages are found, especially if your site is large or new.
That said, even if a page is in your sitemap, there’s no guarantee that it’ll be indexed. There are a few reasons why that could happen:
- Crawl budget: Google has a certain amount of resources it allocates to each site. For larger sites, this means that not every single page may get crawled if the budget runs out.
- Page quality: If your page is thin on quality content, poorly written, or simply doesn't answer users' search queries, Google might decide to skip over it altogether.
- Duplicate content: If you have pages that are too similar or just repeat the same information, Google will typically choose one version of the content to index and show in search results. It might not be the one you want it to be.
- Noindex tags: If you’ve added a “noindex” tag in the HTML of a page, you’re telling search engines to keep that page out of their index, even if it’s listed in the sitemap.
- Changes in algorithms: Google’s algorithms are always changing, and these updates can impact how and when pages get crawled and indexed.
Google will try to crawl a sitemap the moment you submit it. If it can't fetch or read your sitemap, it'll keep trying for a few days. But if the issue keeps happening, Google will eventually give up and stop trying to crawl that specific URL.
Submitting Your Sitemap to Search Engines
To make sure your website gets indexed faster, submit your sitemap to Google Search Console and Bing Webmaster Tools. These tools give you useful insights – like which pages are successfully indexed and if there are any issues stopping certain pages from showing up in search results.
Submitting a Sitemap to Google Search Console
First, make sure you have owner permissions for your website property in Google Search Console. If you don’t, no worries, you can add your sitemap URL to your robots.txt file instead.
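The robots.txt route comes down to a single `Sitemap:` line in that file. As a rough sketch, the snippet below uses Python's standard `urllib.robotparser` to confirm that a robots.txt declares the sitemap you expect. The robots.txt content here is a made-up example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content with a Sitemap line.
robots_txt = """\
User-agent: *
Disallow: /admin/

Sitemap: https://www.yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns the declared sitemap URLs, or None if there are none
# (available in Python 3.8 and later).
sitemaps = parser.site_maps()
print(sitemaps)
```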
You can use an XML sitemap generator to create your sitemap if your CMS or any plugins (in the case of WordPress) don’t auto-generate it. Similarly, depending on your CMS, your sitemap might be published automatically.
If it’s not, place it at the root of your site (like yoursite.com/sitemap.xml).
You’ll also need to test if Google can access it. Use Google Search Console’s URL inspection tool to check if the page fetch is successful.
Once it’s confirmed that everything is good to go, head to the Sitemaps report, paste your sitemap URL into the "Add a new sitemap" box, and hit Submit.
Google should fetch the sitemap right away, though it may take some time to crawl everything listed, depending on your site’s size and activity.
How to Check if Your Sitemap Is Crawled
On the main Sitemaps page in Google Search Console, you can check the status of all the sitemaps you’ve submitted – whether through the report or API. Each sitemap will show one of the following statuses based on Google's last attempt to process it:
- Success: The sitemap was fetched and processed without any issues.
- Couldn't fetch: Google tried but couldn’t retrieve the sitemap.
- Sitemap had errors: Google fetched the sitemap but ran into some errors while reading it.
To get more details on a specific sitemap, just click on it. If it says Sitemap could not be read, it means Google wasn’t able to fetch the file. The error section will give you more specifics on what went wrong, so you can address it and resubmit the sitemap.
If the message says Sitemap can be read, but has errors, you’ll see a list of those errors on the details page. Expand each one to get more info on the problem and how to fix it. Once the issues are resolved, you can resubmit the sitemap for Google to process again.
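Before resubmitting, you can catch some common mistakes locally. The sketch below is a rough pre-check, assuming a regular `<urlset>` sitemap. It is not a reproduction of Google's own validation, and the function name `find_sitemap_problems` is our own.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def find_sitemap_problems(xml_text):
    """Return a list of human-readable problems found in a sitemap string."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    if root.tag != NS + "urlset":
        return [f"unexpected root element: {root.tag}"]

    problems = []
    for index, entry in enumerate(root.findall(NS + "url"), start=1):
        loc = entry.findtext(NS + "loc", default="").strip()
        parts = urlparse(loc)
        if not loc:
            problems.append(f"entry {index}: missing <loc>")
        elif parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"entry {index}: <loc> is not an absolute URL: {loc}")
    return problems


# Hypothetical sitemap with two deliberate mistakes.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.yoursite.com/</loc></url>
  <url><loc>/about</loc></url>
  <url><lastmod>2024-08-15</lastmod></url>
</urlset>"""

problems = find_sitemap_problems(sample)
for problem in problems:
    print(problem)
```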
While the Sitemaps page is useful, it doesn't provide the number of indexed pages directly. It shows the number of URLs you’ve submitted through the sitemap(s) and how many of those URLs Google has discovered, but it doesn't display the precise count of indexed pages.
That’s what the Page indexing report (formerly the Coverage report) in GSC is for. It shows exactly how many of your pages Google has indexed. Plus, this report gives you a clear view of which pages Google recognizes as “accessible” on your site and lets you see which pages are indexed, excluded, or facing issues.
To get to that report, click the dots next to the sitemap, and click on See page indexing.
Next, you'll see how many of the pages are indexed, which are still waiting to be indexed, and which ones Google had trouble with.
Submitting a Sitemap to Bing Webmaster Tools
For Bing, the process is quite similar:
- Go to Bing Webmaster Tools.
- Add your site and verify ownership using one of the available methods.
- Once verified, navigate to the Sitemaps section and submit the sitemap URL just as you would in Google Search Console.
Bing will process the sitemap and provide similar insights into page indexing and any potential issues.
How Do I Create a Sitemap?
If you’re using a CMS like WordPress, Wix, or Shopify, generating a sitemap is simple. For WordPress users, the Yoast SEO plugin automatically generates a sitemap for you at sitemap_index.xml. Similarly, Shopify and Wix also create sitemaps without needing any extra configuration.
If your website runs on a custom-built CMS, it’s important that your sitemap automatically updates whenever new content is added.
What Should You Include in Your Sitemap?
Only include URLs you want search engines to index. This typically includes the canonical version of a page (the main URL, not any sorted or filtered versions). For example:
- Include: https://www.yoursite.com/products
- Exclude: https://www.yoursite.com/products?sort=price
You also want to avoid including any pages set to "noindex" or those blocked by robots.txt. If you include them in your sitemap, it sends mixed messages – essentially inviting search engines to crawl pages you've already instructed them to ignore. This can waste crawl budget and lead to indexing issues.
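These rules are easy to apply in code if you generate your sitemap yourself. Here's a minimal sketch, assuming each page is a dict with the hypothetical keys `url`, `noindex`, and `blocked_by_robots`. It also assumes that any URL with a query string is a sorted or filtered variant, which is a simplification that won't hold for every site.

```python
from urllib.parse import urlsplit


def select_sitemap_urls(pages):
    """Keep only the URLs that should appear in the sitemap."""
    selected = []
    for page in pages:
        if page.get("noindex") or page.get("blocked_by_robots"):
            continue  # search engines were already told to ignore these
        if urlsplit(page["url"]).query:
            continue  # assumed to be a variant such as ?sort=price
        selected.append(page["url"])
    return selected


# Hypothetical page records.
pages = [
    {"url": "https://www.yoursite.com/products"},
    {"url": "https://www.yoursite.com/products?sort=price"},
    {"url": "https://www.yoursite.com/cart", "noindex": True},
    {"url": "https://www.yoursite.com/admin", "blocked_by_robots": True},
]
sitemap_urls = select_sitemap_urls(pages)
print(sitemap_urls)
```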
Handling Large Sitemaps
If you have a lot of content, as a large e-commerce site does, keep in mind that a single sitemap can contain up to 50,000 URLs and must be no larger than 50MB uncompressed. If your site exceeds either limit, you can create multiple sitemaps and list them all in a "sitemap index" file. For example:
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://www.yoursite.com/product-sitemap.xml</loc>
</sitemap>
<sitemap>
<loc>https://www.yoursite.com/blog-sitemap.xml</loc>
</sitemap>
</sitemapindex>
It can be useful to split your sitemap by content type, especially if you have different types of content like products, pages, and a blog.
For example, if you run an online store, you could create a separate sitemap for products, one for pages, and another for your blog, then combine them in a sitemap index.
Even if your sitemaps aren't huge, organizing them this way makes it easier to track how well each type of content is being indexed in Google Search Console. Many platforms like Wix and WordPress automatically handle this for you.
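If your platform doesn't handle this for you, splitting by the 50,000-URL limit takes only a few lines. The sketch below is an illustration with made-up file names and URLs. It only enforces the URL count, so you'd still need to check the 50MB size limit separately.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML that points at each child sitemap URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")


# 120,000 hypothetical product URLs need three sitemap files.
urls = [f"https://www.yoursite.com/products/{n}" for n in range(120_000)]
chunks = chunk_urls(urls)
index_xml = build_sitemap_index(
    f"https://www.yoursite.com/product-sitemap-{n}.xml"
    for n in range(1, len(chunks) + 1)
)
print(len(chunks), [len(chunk) for chunk in chunks])
```

Each chunk would then be written out as its own sitemap file, and only the index URL needs to be submitted.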
International Sitemaps and hreflang Tags
If you run a multilingual website, you can also use your sitemap to specify language variations with hreflang tags. For example, if you have a Spanish and Dutch version of your site, you can include both in your sitemap like this:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://www.yoursite.com/</loc>
<xhtml:link rel="alternate" hreflang="es" href="https://www.yoursite.com/es/"/>
<xhtml:link rel="alternate" hreflang="nl" href="https://www.yoursite.com/nl/"/>
</url>
</urlset>
This makes sitemaps a great alternative if you can’t access the source code or if your developers aren’t available to add hreflang tags to your pages’ HTML. The annotations in your sitemap indicate language and regional targeting for your pages just as the tags would. Here’s a fuller example that also lists the page’s own English version:
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<url>
<loc>https://www.siteguru.co/</loc>
<xhtml:link rel="alternate" hreflang="es" href="https://www.siteguru.co/es" />
<xhtml:link rel="alternate" hreflang="nl" href="https://www.siteguru.co/nl" />
<xhtml:link rel="alternate" hreflang="en" href="https://www.siteguru.co/" />
</url>
</urlset>
In the above example, we specify the available language variants for a particular page. This works whether your language variants are on subdomains or completely different domains.
For example, if your Spanish version is on a subdomain and the Dutch version is on a separate domain, your hreflang tags might look like this:
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<url>
<loc>https://www.webshop.com</loc>
<xhtml:link rel="alternate" hreflang="es" href="https://es.webshop.com/es" />
<xhtml:link rel="alternate" hreflang="nl" href="https://www.webshop.nl/" />
<xhtml:link rel="alternate" hreflang="en" href="https://www.webshop.com" />
</url>
</urlset>
In this case, each xhtml:link element defines an alternate URL for a different language variant of the page. This helps search engines send users to the correct version based on their language and regional settings.
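Writing these entries by hand gets tedious, because Google's guidelines ask for a separate <url> entry per language variant, each listing every variant including itself. Here's a minimal sketch of generating them in Python. The function name `build_hreflang_sitemap` and the URLs are our own example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Serialize with the conventional prefixes: a default namespace and "xhtml:".
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_hreflang_sitemap(variants):
    """Return sitemap XML with one <url> per language variant.

    `variants` maps an hreflang code to that variant's URL. Every <url>
    entry lists all variants, including itself.
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in variants.values():
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        for lang, href in variants.items():
            ET.SubElement(entry, f"{{{XHTML_NS}}}link",
                          rel="alternate", hreflang=lang, href=href)
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical language variants of one page.
hreflang_xml = build_hreflang_sitemap({
    "en": "https://www.yoursite.com/",
    "es": "https://www.yoursite.com/es/",
    "nl": "https://www.yoursite.com/nl/",
})
print(hreflang_xml)
```

Registering the two namespaces up front matters here: without the `xmlns:xhtml` declaration on the root element, the `xhtml:link` elements would not be valid XML.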
If you’re a SiteGuru user, the Hreflang report (Technical > Hreflangs) shows you which pages have hreflang tags, and helps you quickly identify missing tags (if you’re targeting multiple languages with your website).
Sitemaps for Images and Videos
While most of our focus has been on pages, it's important to note that images and videos can (and should!) also be included in a sitemap. Getting your images indexed quickly can improve your rankings in Google Image Search.
There are two ways to do this:
Option 1: Include all images on a page within the <url> tag for that page, like this:
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<url>
<loc>https://www.siteguru.co/</loc>
<lastmod>2022-05-15</lastmod>
<image:image>
<image:loc>https://www.siteguru.co/logo.png</image:loc>
<image:title>SiteGuru Logo</image:title>
</image:image>
<image:image>
<image:loc>https://www.siteguru.co/team.jpeg</image:loc>
<image:title>Our Team</image:title>
</image:image>
</url>
</urlset>
Option 2: Create a separate sitemap file just for images, like this:
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<url>
<loc>https://www.siteguru.co/</loc>
<image:image>
<image:loc>https://www.siteguru.co/logo.png</image:loc>
<image:title>SiteGuru Logo</image:title>
</image:image>
<image:image>
<image:loc>https://www.siteguru.co/team.jpeg</image:loc>
<image:title>Our Team</image:title>
</image:image>
</url>
</urlset>
Both methods are effective, so choose the one that best suits your site's structure and content.
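Whichever option you pick, it can help to double-check which images ended up under which page. The sketch below reads an image sitemap with Python's standard library and groups image URLs by page. The function name `images_by_page` and the sample URLs are our own illustration.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}


def images_by_page(xml_text):
    """Map each page URL in a sitemap to the image URLs listed under it."""
    root = ET.fromstring(xml_text)
    result = {}
    for entry in root.findall("sm:url", NAMESPACES):
        page = entry.findtext("sm:loc", namespaces=NAMESPACES)
        result[page] = [
            loc.text
            for loc in entry.findall("image:image/image:loc", NAMESPACES)
        ]
    return result


# Hypothetical image sitemap with one page and two images.
image_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://www.yoursite.com/</loc>
    <image:image><image:loc>https://www.yoursite.com/logo.png</image:loc></image:image>
    <image:image><image:loc>https://www.yoursite.com/team.jpeg</image:loc></image:image>
  </url>
</urlset>"""

images = images_by_page(image_sitemap)
print(images)
```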
SiteGuru's Sitemap Report
With SiteGuru’s Sitemap Report, you’ll always be in the know about any issues with your sitemaps, including:
- Missing pages: It’ll flag any important pages that are missing from your sitemap, so you can add them to GSC right then and there.
- Broken links: If you have pages in your sitemap that aren’t working anymore, SiteGuru will point those out to you.
- Redirected pages: It also helps you find any pages that have been redirected, ensuring your sitemap stays up to date.
All you have to do is go to your SiteGuru dashboard, then Technical > Sitemaps. With this report, you’ll have a clearer picture of your sitemap’s health, making sure all the right pages get crawled and indexed. Plus, SiteGuru lets you download your full sitemap easily, so you can review and update it whenever you need.
Simplify SEO Beyond Sitemaps with SiteGuru
Whether you're new to SEO or just short on time, SiteGuru takes the complexity out of optimization.
It gives you access to a prioritized SEO to-do list, as well as simple yet powerful reports to meet all of your SEO goals. Plus, you’ll receive weekly performance updates and step-by-step optimization tips, so you tackle SEO without feeling overwhelmed. Try it free for 14 days!