XML Sitemap Problems
What This Means
An XML sitemap is a file that lists all the important pages on your website, helping search engines discover, crawl, and index your content more efficiently. When sitemaps are missing, incorrect, or poorly structured, search engines may miss important pages, waste crawl budget on unimportant URLs, index wrong versions of pages, or struggle to understand your site's structure, leading to reduced search visibility and organic traffic.
How XML Sitemaps Work
Basic Sitemap Structure:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://example.com/</loc>
<lastmod>2025-01-15</lastmod>
<changefreq>daily</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>https://example.com/products/widget</loc>
<lastmod>2025-01-10</lastmod>
<changefreq>weekly</changefreq>
<priority>0.8</priority>
</url>
</urlset>
What Each Element Means:
- <loc> - Full URL of the page (required)
- <lastmod> - Last modification date (optional but recommended)
- <changefreq> - How often the page changes (optional, advisory only)
- <priority> - Relative importance, 0.0-1.0 (optional, advisory only)
Impact on Your Business
SEO Consequences:
- Pages not indexed - Search engines can't find important content
- Delayed indexing - New pages take weeks instead of days
- Wasted crawl budget - Bots crawl wrong pages
- Lost organic traffic - Products/articles don't appear in search
- Poor site architecture signals - Google sees disorganized site
Common Sitemap Problems:
- No sitemap exists - Search engines must discover pages organically
- Outdated sitemap - Lists deleted pages, missing new content
- Too large - Over 50,000 URLs or 50MB uncompressed (the sitemap protocol limit, which Google enforces)
- Wrong URLs - 404s, redirects, non-canonical URLs
- Blocked by robots.txt - Sitemap location blocked from crawling
- Not submitted to Search Console - Google may not know it exists
Real-World Impact:
- Sites with sitemaps get indexed 50% faster
- E-commerce sites without sitemaps miss up to 30% of product indexing
- News sites benefit from a dedicated news sitemap, which helps Google News discover new articles quickly
- Large sites (10,000+ pages) need sitemaps for efficient crawling
How to Diagnose
Method 1: Check Sitemap Exists
- Visit https://yoursite.com/sitemap.xml
- or try https://yoursite.com/sitemap_index.xml
- Check if it loads
What to Look For:
✅ Good sitemap:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://example.com/page1</loc>
<lastmod>2025-01-15</lastmod>
</url>
</urlset>
❌ Problems:
404 Not Found - No sitemap exists
or
<urlset>
<url>
<loc>example.com/page</loc> <!-- Missing protocol https:// -->
</url>
</urlset>
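The checks above can be sketched in Python. This is a minimal illustration, not a full validator: the function name `check_sitemap_document` is hypothetical, and it only tests that the document parses as XML, that the root element is in the sitemap namespace, and that each `<loc>` is an absolute URL.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap_document(xml_bytes: bytes) -> list[str]:
    """Return a list of problems found in a sitemap document (empty if none)."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        # An HTML error page or truncated file usually ends up here.
        return [f"not well-formed XML: {exc}"]

    problems = []
    # ElementTree spells namespaced tags as "{namespace}localname".
    valid_roots = (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")
    if root.tag not in valid_roots:
        problems.append(f"unexpected root element: {root.tag}")

    for element in root.iter():
        # Accept <loc> with or without the namespace so we can still report on it.
        if element.tag in ("loc", f"{{{SITEMAP_NS}}}loc"):
            url = (element.text or "").strip()
            if not url.startswith(("http://", "https://")):
                problems.append(f"<loc> is not an absolute URL: {url!r}")
    return problems
```

Passing the "good sitemap" example should return an empty list, while the problem example (no namespace, no protocol in `<loc>`) should return two entries. Fetching the file over HTTP is left out on purpose so the function can be run against a saved copy.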
Method 2: Google Search Console Sitemaps Report
- Open Google Search Console
- Go to Sitemaps section (left menu)
- Check submitted sitemaps
Check For:
Status: Success ✅
URLs discovered: 1,523
URLs indexed: 1,487
or
Status: Couldn't fetch ❌
Error: 404 Not Found
or
Status: Has errors ⚠️
4 URLs couldn't be indexed
Common Errors:
- "Couldn't fetch" - Sitemap URL doesn't exist or is blocked
- "Sitemap is HTML page" - Not an XML file
- "Parsing error" - Invalid XML syntax
- "Sitemap contains URLs blocked by robots.txt"
- "Sitemap is too large"
Method 3: XML Sitemap Validator
- Visit XML Sitemap Validator
- Enter your sitemap URL
- Click Validate
Check Results:
Method 4: Manual Sitemap Review
Common issues to check:
<!-- ❌ HTTP instead of HTTPS -->
<loc>http://example.com/page</loc>
<!-- ❌ Relative URLs -->
<loc>/page</loc>
<!-- ❌ 404 pages in sitemap -->
<loc>https://example.com/deleted-page</loc>
<!-- ❌ Redirecting URLs -->
<loc>https://example.com/old-url</loc> <!-- Redirects to new-url -->
<!-- ❌ Non-canonical URLs -->
<loc>https://example.com/page?tracking=123</loc>
<!-- Canonical is: https://example.com/page -->
<!-- ❌ Blocked URLs -->
<loc>https://example.com/admin/</loc>
<!-- Blocked in robots.txt -->
<!-- ❌ Duplicate URLs -->
<loc>https://example.com/page</loc>
<loc>https://example.com/page</loc> <!-- Listed twice -->
<!-- ❌ Wrong domain -->
<loc>https://staging.example.com/page</loc>
<!-- On production sitemap -->
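Several of these issues can be spotted without any network access. The sketch below assumes you have already extracted the `<loc>` values into a list; `audit_locs` and its issue labels are illustrative names, not part of any tool. It cannot detect 404s, redirects, or robots.txt blocks, which require fetching each URL.

```python
from collections import Counter
from urllib.parse import urlsplit


def audit_locs(locs: list[str], expected_host: str) -> dict[str, list[str]]:
    """Map each suspicious sitemap URL to the list of issues found for it."""
    counts = Counter(locs)
    report: dict[str, list[str]] = {}
    for url in counts:
        parts = urlsplit(url)
        issues = []
        if not parts.scheme or not parts.netloc:
            issues.append("relative or missing protocol")
        else:
            if parts.scheme != "https":
                issues.append("not https")
            if parts.hostname != expected_host:
                issues.append("wrong domain")
        if parts.query:
            # Query strings often indicate tracking or sort parameters.
            issues.append("has query string (check canonical)")
        if counts[url] > 1:
            issues.append("listed more than once")
        if issues:
            report[url] = issues
    return report
```

For example, `audit_locs(["https://staging.example.com/page"], "example.com")` reports a wrong domain for that URL. A query string is only a hint: some sites legitimately use parameterized canonical URLs, so treat that flag as a prompt to check the page's canonical tag.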
Method 5: Check Sitemap Size
- Download sitemap XML file
- Check file size
- Count URLs
Limits:
- Maximum 50,000 URLs per sitemap
- Maximum 50MB uncompressed
- If exceeded, use sitemap index
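# Count URLs in sitemap (grep -o counts every match, even if the XML is on a single line)
curl -s https://example.com/sitemap.xml | grep -o "<loc>" | wc -l
# Check uncompressed size in bytes (Content-Length may be missing or reflect a compressed response)
curl -s https://example.com/sitemap.xml | wc -c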
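The same size check can be done on a downloaded file in Python. This is a rough sketch: `sitemap_stats` is a hypothetical helper, and it assumes the bytes passed in are the uncompressed XML.

```python
import xml.etree.ElementTree as ET

MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50MB, measured on the uncompressed file


def sitemap_stats(xml_bytes: bytes) -> dict:
    """Count top-level entries and bytes, and compare them with the limits."""
    root = ET.fromstring(xml_bytes)
    # Strip the "{namespace}" prefix so <url> and <sitemap> match either way.
    entries = sum(
        1 for child in root if child.tag.rsplit("}", 1)[-1] in ("url", "sitemap")
    )
    size = len(xml_bytes)
    return {
        "entries": entries,
        "bytes": size,
        "within_limits": entries <= MAX_URLS and size <= MAX_BYTES,
    }
```

If `within_limits` comes back false, the file needs to be split and referenced from a sitemap index.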
General Fixes
Fix 1: Create Basic XML Sitemap
Simple sitemap structure:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<!-- Homepage -->
<url>
<loc>https://example.com/</loc>
<lastmod>2025-01-15</lastmod>
<changefreq>daily</changefreq>
<priority>1.0</priority>
</url>
<!-- Important pages -->
<url>
<loc>https://example.com/about</loc>
<lastmod>2025-01-10</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
<url>
<loc>https://example.com/products</loc>
<lastmod>2025-01-14</lastmod>
<changefreq>weekly</changefreq>
<priority>0.9</priority>
</url>
<!-- Product pages -->
<url>
<loc>https://example.com/products/widget</loc>
<lastmod>2025-01-12</lastmod>
<changefreq>weekly</changefreq>
<priority>0.7</priority>
</url>
<!-- Blog posts -->
<url>
<loc>https://example.com/blog/post-title</loc>
<lastmod>2025-01-08</lastmod>
<changefreq>monthly</changefreq>
<priority>0.6</priority>
</url>
</urlset>
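If you would rather generate this file than write it by hand, a small script is enough for a static site. The sketch below uses Python's standard library; `build_sitemap` and the shape of the `pages` list are assumptions made for illustration. Building the XML through `ElementTree` rather than string concatenation means characters such as `&` in URLs are escaped automatically.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[dict]) -> bytes:
    """Build a <urlset> document from dicts with 'loc' and optional 'lastmod'."""
    # Register the namespace with an empty prefix so the output uses xmlns="...".
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page["loc"]
        if page.get("lastmod"):
            ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = page["lastmod"]
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    xml = build_sitemap([
        {"loc": "https://example.com/", "lastmod": "2025-01-15"},
        {"loc": "https://example.com/about", "lastmod": "2025-01-10"},
    ])
    print(xml.decode("utf-8"))
```

The sketch leaves out `<changefreq>` and `<priority>` because they are advisory only; add them the same way as `<lastmod>` if you want them.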
Fix 2: Use Sitemap Index for Large Sites
When you have more than 50,000 URLs (or a single file would exceed 50MB):
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://example.com/sitemap-products.xml</loc>
<lastmod>2025-01-15</lastmod>
</sitemap>
<sitemap>
<loc>https://example.com/sitemap-blog.xml</loc>
<lastmod>2025-01-14</lastmod>
</sitemap>
<sitemap>
<loc>https://example.com/sitemap-pages.xml</loc>
<lastmod>2025-01-10</lastmod>
</sitemap>
<sitemap>
<loc>https://example.com/sitemap-images.xml</loc>
<lastmod>2025-01-12</lastmod>
</sitemap>
</sitemapindex>
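Splitting a large URL list and producing the index can be automated. The following is a sketch under assumptions: `chunk_urls` and `build_index` are hypothetical helpers, and the child sitemap file names are up to you. It only handles the 50,000-URL limit; checking the 50MB size limit per file is a separate step.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000


def chunk_urls(urls: list[str], size: int = MAX_URLS_PER_FILE) -> list[list[str]]:
    """Split a URL list into consecutive chunks of at most `size` URLs."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(child_sitemap_urls: list[str]) -> bytes:
    """Build a <sitemapindex> document that points at each child sitemap."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc in child_sitemap_urls:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = loc
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)
```

Each chunk would then be written out as its own `<urlset>` file, and the list of those file URLs passed to `build_index`.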
Fix 3: Generate Sitemap Dynamically
Node.js/Express example:
const express = require('express');
const { SitemapStream, streamToPromise } = require('sitemap');

const app = express();

app.get('/sitemap.xml', async (req, res) => {
  try {
    const sitemap = new SitemapStream({ hostname: 'https://example.com' });
    // Add homepage
    sitemap.write({ url: '/', changefreq: 'daily', priority: 1.0 });
    // Add products from database (getProductsFromDB is your own data-access function)
    const products = await getProductsFromDB();
    products.forEach(product => {
      sitemap.write({
        url: `/products/${product.slug}`,
        lastmod: product.updatedAt,
        changefreq: 'weekly',
        priority: 0.8
      });
    });
    // Add blog posts (getPostsFromDB is your own data-access function)
    const posts = await getPostsFromDB();
    posts.forEach(post => {
      sitemap.write({
        url: `/blog/${post.slug}`,
        lastmod: post.updatedAt,
        changefreq: 'monthly',
        priority: 0.6
      });
    });
    // Close the stream, then collect the generated XML
    sitemap.end();
    const xml = await streamToPromise(sitemap);
    res.header('Content-Type', 'application/xml');
    res.send(xml.toString());
  } catch (error) {
    console.error(error);
    res.status(500).send('Error generating sitemap');
  }
});
Fix 4: WordPress Sitemap
Use Yoast SEO or RankMath:
// Yoast generates sitemap automatically at:
// https://example.com/sitemap_index.xml
// RankMath generates at:
// https://example.com/sitemap_index.xml
// Or use WordPress core (5.5+):
// Automatically creates at:
// https://example.com/wp-sitemap.xml
Custom WordPress sitemap:
// functions.php
function generate_custom_sitemap() {
    $posts = get_posts(['numberposts' => -1]);
    $pages = get_pages();
    header('Content-Type: application/xml; charset=UTF-8');
    echo '<?xml version="1.0" encoding="UTF-8"?>';
    echo '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
    // Homepage (no lastmod: today's date would not reflect a real change)
    echo '<url>';
    echo '<loc>' . esc_url(home_url('/')) . '</loc>';
    echo '<priority>1.0</priority>';
    echo '</url>';
    // Posts (esc_url keeps characters such as & valid inside XML)
    foreach ($posts as $post) {
        echo '<url>';
        echo '<loc>' . esc_url(get_permalink($post)) . '</loc>';
        echo '<lastmod>' . get_post_modified_time('Y-m-d', false, $post) . '</lastmod>';
        echo '<priority>0.6</priority>';
        echo '</url>';
    }
    // Pages
    foreach ($pages as $page) {
        echo '<url>';
        echo '<loc>' . esc_url(get_permalink($page)) . '</loc>';
        echo '<lastmod>' . get_post_modified_time('Y-m-d', false, $page) . '</lastmod>';
        echo '<priority>0.8</priority>';
        echo '</url>';
    }
    echo '</urlset>';
    exit;
}
// Create virtual sitemap
// After adding this rule, flush rewrite rules once (Settings > Permalinks > Save Changes)
add_action('init', function() {
    add_rewrite_rule('^sitemap\.xml$', 'index.php?custom_sitemap=1', 'top');
});
// Register the query variable so WordPress passes it through
add_filter('query_vars', function($vars) {
    $vars[] = 'custom_sitemap';
    return $vars;
});
// Output the sitemap when the rewrite rule matches
add_action('template_redirect', function() {
    if (get_query_var('custom_sitemap')) {
        generate_custom_sitemap();
    }
});
Fix 5: Shopify Sitemap
Shopify automatic sitemaps:
# Main sitemap index
https://yourstore.myshopify.com/sitemap.xml
# Component sitemaps (auto-generated)
https://yourstore.myshopify.com/sitemap_products_1.xml
https://yourstore.myshopify.com/sitemap_collections_1.xml
https://yourstore.myshopify.com/sitemap_pages_1.xml
https://yourstore.myshopify.com/sitemap_blogs_1.xml
Fix 6: Clean Up Sitemap
Remove problematic URLs:
<!-- BEFORE: Messy sitemap -->
<urlset>
<url>
<loc>http://example.com/page</loc> <!-- HTTP -->
</url>
<url>
<loc>https://example.com/admin/</loc> <!-- Blocked -->
</url>
<url>
<loc>https://example.com/page?sort=price</loc> <!-- Parameter -->
</url>
<url>
<loc>https://example.com/deleted</loc> <!-- 404 -->
</url>
<url>
<loc>https://example.com/page</loc> <!-- Canonical -->
</url>
</urlset>
<!-- AFTER: Clean sitemap -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://example.com/page</loc> <!-- Only canonical URL -->
<lastmod>2025-01-15</lastmod>
</url>
</urlset>
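The before/after cleanup can be expressed as a filter over crawl results. This is a sketch under assumptions: the `CrawledPage` record and its fields are hypothetical and stand in for whatever your crawler or CMS export provides (status code, canonical URL, robots and noindex flags).

```python
from dataclasses import dataclass


@dataclass
class CrawledPage:
    url: str
    status: int               # HTTP status returned for this URL
    canonical: str            # canonical URL declared by the page
    blocked_by_robots: bool = False
    noindex: bool = False


def clean_sitemap_urls(pages: list[CrawledPage]) -> list[str]:
    """Keep only HTTPS, 200-status, indexable, canonical URLs, without duplicates."""
    seen: set[str] = set()
    kept: list[str] = []
    for page in pages:
        if page.status != 200:
            continue  # drops 404s and redirecting URLs
        if page.blocked_by_robots or page.noindex:
            continue
        if page.canonical != page.url:
            continue  # parameter or tracking variants point elsewhere
        if not page.url.startswith("https://"):
            continue
        if page.url in seen:
            continue
        seen.add(page.url)
        kept.append(page.url)
    return kept
```

Run against the five URLs in the messy example, this filter keeps only `https://example.com/page`, matching the clean version.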
Fix 7: Submit to Search Engines
Google Search Console:
- Go to Sitemaps section
- Enter sitemap URL: sitemap.xml or sitemap_index.xml
- Click Submit
- Monitor for errors
Bing Webmaster Tools:
- Go to Sitemaps section
- Click Submit sitemap
- Enter full URL: https://example.com/sitemap.xml
- Click Submit
robots.txt reference:
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
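To confirm that robots.txt declares the sitemap and does not block URLs listed in it, Python's standard `urllib.robotparser` can parse the file offline. In this sketch `check_against_robots` is a hypothetical helper, and `site_maps()` requires Python 3.8 or later. Note that this parser's rule matching is simpler than Googlebot's (for example around wildcards), so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser


def check_against_robots(
    robots_txt: str, urls: list[str], user_agent: str = "*"
) -> tuple[list[str], list[str]]:
    """Return (sitemaps declared in robots.txt, URLs the rules disallow)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when there is no "Sitemap:" line.
    declared = parser.site_maps() or []
    blocked = [url for url in urls if not parser.can_fetch(user_agent, url)]
    return declared, blocked
```

With the robots.txt shown above, `https://example.com/admin/` would be reported as blocked and should not appear in the sitemap.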
Fix 8: Add Images to Sitemap
Image sitemap extension:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<url>
<loc>https://example.com/products/widget</loc>
<!-- Google deprecated image:caption and image:title in 2022; only image:loc is used -->
<image:image>
<image:loc>https://example.com/images/widget-main.jpg</image:loc>
</image:image>
<image:image>
<image:loc>https://example.com/images/widget-side.jpg</image:loc>
</image:image>
</url>
</urlset>
Platform-Specific Guides
Detailed implementation instructions for your specific platform:
| Platform | Troubleshooting Guide |
|---|---|
| Shopify | Shopify Sitemap Guide |
| WordPress | WordPress Sitemap Guide |
| Wix | Wix Sitemap Guide |
| Squarespace | Squarespace Sitemap Guide |
| Webflow | Webflow Sitemap Guide |
Verification
After creating/updating sitemap:
Test 1: Validate XML
- Visit sitemap URL directly
- Browser should display XML
- No syntax errors
- All URLs use https://
Test 2: XML Validator
- Use XML Sitemap Validator
- Enter sitemap URL
- Check for errors
- Verify all URLs return 200
Test 3: Google Search Console
- Submit sitemap
- Wait 24-48 hours
- Check Sitemaps report
- Verify "Success" status
- Check indexed vs discovered ratio
Test 4: URL Inspection (formerly Fetch as Google)
- URL Inspection tool
- Enter a URL from sitemap
- Should be discoverable
- Should be indexable
Common Mistakes
- No sitemap - Create one!
- Not submitting to Search Console - Google may not find it
- Including 404s - Remove deleted pages
- Including redirects - Use final destination URL
- Including noindex pages - Don't list pages with noindex tag
- Using HTTP - All URLs should be HTTPS
- Too many URLs - Split into multiple sitemaps
- Not updating - Regenerate when adding/removing pages
- Including blocked URLs - Don't list robots.txt blocked pages
- No lastmod dates - Help crawlers prioritize