Duplicate Content

Diagnose and fix duplicate content issues that dilute search rankings and confuse search engines

What This Means

Duplicate content occurs when identical or substantially similar content appears on multiple URLs, either within your own website (internal duplication) or across different websites (external duplication). Search engines struggle to determine which version to index and rank, leading to diluted SEO value and potentially lower rankings.

Types of Duplicate Content

Internal Duplication:

Same content on multiple URLs on your site
HTTP vs HTTPS versions
WWW vs non-WWW versions
Trailing slash vs non-trailing slash
URL parameters creating duplicate pages
Print versions of pages
Mobile vs desktop versions (if separate URLs)

External Duplication:

Your content copied to other websites (scraped)
Syndicated content without proper attribution
Product descriptions copied from manufacturers
Press releases on multiple sites

Technical Duplication:

Session IDs in URLs
Tracking parameters (utm_, etc.)
Faceted navigation creating URL variations
Pagination without proper handling
Case-sensitive URLs treated as different pages

Impact on Your Business

Search Rankings:

Search engines don't know which version to rank
Ranking power is diluted across duplicate URLs
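Your original may be outranked by a copy on another URL or site
Duplicates are usually filtered from results rather than penalized; Google reserves penalties for extreme cases such as deliberately deceptive duplication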

Crawl Budget:

Search engines waste time crawling duplicates
Less time spent on unique, valuable pages
Important pages may not get crawled
Slower indexing of new content

Link Equity:

Backlinks split across duplicate URLs
Individual URLs have less ranking power
Link value is diluted instead of consolidated
Harder to build strong page authority

User Experience:

Confusing to find same content on multiple URLs
Inconsistent URLs make sharing difficult
Visitors may land on outdated versions
Reduced trust in website quality

How to Diagnose

Method 1: Google Search Console

Log into Google Search Console
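Open the "Page indexing" report under Indexing → Pages (called "Coverage" in older versions of Search Console)
Look for:
- "Duplicate without user-selected canonical"
- "Duplicate, Google chose different canonical than user"
- Multiple versions of same page indexed
Use URL Inspection on a flagged URL to compare the user-declared canonical with the Google-selected canonical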
Check "Sitemaps" for URLs submitted vs indexed

What to Look For:

Pages flagged as duplicates
Canonical tag conflicts
Multiple versions of homepage indexed
Parameter-based duplicates

Method 2: Site: Search Operator

Google: site:yourwebsite.com "exact page title"
Review how many results appear
Check if multiple URLs have same content
Look for HTTP/HTTPS and WWW variations

What to Look For:

Multiple results for same title
Different URLs with identical content
Protocol variations (http/https)
Subdomain variations (www/non-www)

Method 3: Screaming Frog SEO Spider

Download Screaming Frog
Crawl your website
Navigate to "Content" → "Duplicate" tab
Review:
- Duplicate pages (exact match)
- Near duplicates (similar content)
- Duplicate titles
- Duplicate meta descriptions

What to Look For:

Pages with 100% content similarity
Pages with >90% similarity (near duplicates)
Duplicate title tags
URL patterns creating duplicates
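
The exact-match and >90% near-duplicate checks above can also be approximated outside a crawler. The following is a minimal sketch, not Screaming Frog's own algorithm: it compares page text word by word using Python's standard `difflib`. The `pages` mapping (URL to extracted body text), the function names, and the 0.9 default threshold are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into words so formatting differences are ignored."""
    return re.findall(r"\w+", text.lower())


def similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two page bodies, compared word by word."""
    return SequenceMatcher(None, tokenize(a), tokenize(b), autojunk=False).ratio()


def find_duplicates(pages: dict[str, str], threshold: float = 0.9):
    """Yield (url_a, url_b, ratio, label) for every pair of pages at or above the threshold."""
    for (url_a, text_a), (url_b, text_b) in combinations(pages.items(), 2):
        ratio = similarity(text_a, text_b)
        if ratio >= threshold:
            label = "exact" if ratio == 1.0 else "near"
            yield url_a, url_b, ratio, label
```

Comparing every pair is quadratic in the number of pages, so this only suits small samples, such as the pages sharing one suspicious URL pattern.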

Method 4: Copyscape or Similar Tools

Visit Copyscape
Enter your page URL
Search for duplicate content online
Review results for:
- External sites with your content
- How much content is duplicated
- Whether proper attribution exists

What to Look For:

Content scrapers copying your pages
Syndicated content without canonical
Competitor sites with your content
Product descriptions on multiple sites

Method 5: Check URL Variations

Manually test common duplicates:

# Test these URL variations for your homepage:
https://www.example.com
https://example.com
http://www.example.com
http://example.com
https://www.example.com/
https://www.example.com/index.html
https://www.example.com/index.php
https://www.example.com/?

What to Look For:

Multiple variations loading successfully
200 OK status on all variations
No redirects to preferred version
Different URLs serving same content
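
The list of variations above can be generated rather than typed by hand. This is a small sketch under stated assumptions: `homepage_variants` and `live_duplicates` are hypothetical helper names, and the status codes passed to `live_duplicates` are expected to come from whatever tool you use to request each URL without following redirects.

```python
from itertools import product


def homepage_variants(domain: str) -> list[str]:
    """Return common homepage URL variations for a bare domain like 'example.com'."""
    schemes = ("https", "http")
    hosts = (f"www.{domain}", domain)
    # "" and "/" request the same resource, but both forms appear in links.
    suffixes = ("", "/", "/index.html", "/index.php", "/?")
    return [
        f"{scheme}://{host}{suffix}"
        for scheme, host, suffix in product(schemes, hosts, suffixes)
    ]


def live_duplicates(statuses: dict[str, int], preferred: str) -> list[str]:
    """List variations that answer 200 OK even though they are not the preferred URL."""
    return [url for url, status in statuses.items() if status == 200 and url != preferred]
```

Any URL returned by `live_duplicates` loads successfully without redirecting, which is the pattern this method is looking for.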

General Fixes

Fix 1: Set Preferred Domain with Canonical Tags

Tell search engines which version is the original:

Add canonical tag to all pages:

<head>
  <link rel="canonical" href="https://www.example.com/page/">
</head>

Point all duplicate versions to canonical:

<!-- On https://example.com/page/ -->
<!-- On http://www.example.com/page/ -->
<!-- On http://example.com/page/ -->
<link rel="canonical" href="https://www.example.com/page/">

Self-referencing canonical on preferred version:

<!-- On https://www.example.com/page/ -->
<link rel="canonical" href="https://www.example.com/page/">

Canonical for URL parameters:

<!-- On https://www.example.com/page/?utm_source=email -->
<link rel="canonical" href="https://www.example.com/page/">
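
Whether a page actually carries the intended canonical tag can be checked in code. The sketch below uses Python's standard `html.parser`; the class name, the function name, and the four result labels are assumptions chosen for illustration, and the HTML is assumed to have been fetched already.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> found in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.canonicals: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def check_canonical(html: str, expected: str) -> str:
    """Return 'ok', 'missing', 'multiple', or 'mismatch' for one page's canonical tag."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonicals:
        return "missing"
    if len(finder.canonicals) > 1:
        return "multiple"
    return "ok" if finder.canonicals[0] == expected else "mismatch"
```

The comparison is an exact string match, so a canonical that differs only by protocol, host, or trailing slash is reported as a mismatch, which is usually what you want when auditing a preferred version.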

Fix 2: Implement 301 Redirects

Permanently redirect duplicates to preferred version:

Redirect HTTP to HTTPS:

# Nginx
server {
    listen 80;
    server_name example.com www.example.com;
    return 301 https://www.example.com$request_uri;
}

# Apache .htaccess
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]

Redirect non-WWW to WWW (or vice versa):

# Nginx - non-WWW to WWW
server {
    listen 443 ssl;
    server_name example.com;
    return 301 https://www.example.com$request_uri;
}

# Apache .htaccess - non-WWW to WWW
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]

Redirect trailing slash inconsistencies:

# Nginx - add trailing slash
rewrite ^([^.]*[^/])$ $1/ permanent;

Redirect old URLs to new URLs:

# Apache .htaccess
Redirect 301 /old-page.html https://www.example.com/new-page/
Redirect 301 /old-category/ https://www.example.com/new-category/

Fix 3: Handle URL Parameters

Google retired the legacy URL Parameters tool in Search Console in 2022, so parameter behavior can no longer be configured there. Classify your parameters and handle them on the site instead:

Classify each parameter by behavior:
- Passive - Doesn't change page content (e.g., utm_source)
- Active - Changes content (e.g., color, size)
For passive parameters: Canonicalize to the parameter-free URL and keep them out of internal links and sitemaps
For active parameters: Choose a representative URL and point the canonical tags of its variants at it

Common parameters:

utm_* - Passive (tracking)
sessionid - Passive (tracking)
sort - Active (changes content)
page - Active (pagination)
color, size - Active (filters)
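
The passive/active split above can be applied when computing the URL to put in a canonical tag. This is a minimal sketch with Python's standard `urllib.parse`; the `PASSIVE_PREFIXES` and `PASSIVE_NAMES` values simply mirror the list above, and the function names are hypothetical. Whether active parameters belong in the canonical at all depends on how you treat filtered pages.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed classification, taken from the common parameters listed above.
PASSIVE_PREFIXES = ("utm_",)
PASSIVE_NAMES = {"sessionid"}


def is_passive(name: str) -> bool:
    """True for parameters that do not change page content."""
    name = name.lower()
    return name in PASSIVE_NAMES or name.startswith(PASSIVE_PREFIXES)


def strip_passive_params(url: str) -> str:
    """Drop passive parameters and sort the rest so equivalent URLs compare equal."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not is_passive(key)
    ]
    kept.sort()
    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

Sorting the remaining parameters means `?size=10&color=red` and `?color=red&size=10` collapse to a single URL instead of two.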

Fix 4: Implement Rel="Next" and Rel="Prev" for Pagination

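Give each paginated page a self-referencing canonical and link the series with rel="prev" and rel="next". Google stated in 2019 that it no longer uses these attributes as an indexing signal, but they remain valid HTML and other search engines may still read them: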

On paginated series:

<!-- Page 1 (https://example.com/blog/) -->
<head>
  <link rel="canonical" href="https://example.com/blog/">
  <link rel="next" href="https://example.com/blog/page/2/">
</head>

<!-- Page 2 (https://example.com/blog/page/2/) -->
<head>
  <link rel="canonical" href="https://example.com/blog/page/2/">
  <link rel="prev" href="https://example.com/blog/">
  <link rel="next" href="https://example.com/blog/page/3/">
</head>

<!-- Page 3 (https://example.com/blog/page/3/) -->
<head>
  <link rel="canonical" href="https://example.com/blog/page/3/">
  <link rel="prev" href="https://example.com/blog/page/2/">
</head>
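
The head tags for a paginated series follow a fixed pattern, so they can be generated per page. The sketch below assumes the URL scheme used in the example (page 1 at the base URL, later pages at `page/N/`); `page_url` and `pagination_links` are hypothetical helper names.

```python
from html import escape


def page_url(base: str, page: int) -> str:
    """Page 1 lives at the base URL; later pages at base + 'page/N/'."""
    return base if page == 1 else f"{base}page/{page}/"


def pagination_links(base: str, page: int, last_page: int) -> list[str]:
    """Return the canonical, prev, and next link tags for one page of a series."""
    if not 1 <= page <= last_page:
        raise ValueError("page must be between 1 and last_page")
    # Every page canonicalizes to itself, not to page 1.
    tags = [f'<link rel="canonical" href="{escape(page_url(base, page))}">']
    if page > 1:
        tags.append(f'<link rel="prev" href="{escape(page_url(base, page - 1))}">')
    if page < last_page:
        tags.append(f'<link rel="next" href="{escape(page_url(base, page + 1))}">')
    return tags
```

Calling `pagination_links("https://example.com/blog/", 2, 3)` produces the three tags shown for Page 2, with `prev` pointing at the base URL rather than at `page/1/`.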

Or use "View All" page approach:

<!-- On paginated pages -->
<link rel="canonical" href="https://example.com/blog/all/">

Fix 5: Handle E-commerce and Filtered Pages

Use canonical tags for filtered URLs:

<!-- https://example.com/shoes?color=red&size=10 -->
<link rel="canonical" href="https://example.com/shoes/">

Or use noindex for filter combinations:

<!-- On filtered pages -->
<meta name="robots" content="noindex, follow">

Use clean URLs for important filters:

<!-- Instead of: /shoes?color=red -->
<!-- Use: /shoes/red/ -->
<link rel="canonical" href="https://example.com/shoes/red/">

AJAX-based filters (don't change URL):

// Filter content without changing URL
// No duplicate URL created

Fix 6: Handle Print and Mobile Versions

Separate print/mobile URLs:

Print versions:

<!-- On print version page -->
<link rel="canonical" href="https://example.com/article/">

<!-- Or use CSS print styles instead of separate URL -->
<style>
  @media print {
    /* Print styles */
  }
</style>

Mobile versions (if using separate m. subdomain):

<!-- On desktop version (www.example.com) -->
<link rel="alternate" media="only screen and (max-width: 640px)"
      href="https://m.example.com/page/">

<!-- On mobile version (m.example.com) -->
<link rel="canonical" href="https://www.example.com/page/">

Preferred approach: Responsive design (no separate URLs):

<!-- Single URL serves both desktop and mobile -->
<!-- No duplicate content issue -->

Fix 7: Handle Syndicated Content

When publishing content on multiple sites:

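Ask partners to add a canonical tag on syndicated versions (a cross-site canonical is only a hint, so Google's current guidance also suggests having partners noindex their copy):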

<!-- On partner site publishing your content -->
<link rel="canonical" href="https://www.yoursite.com/original-article/">

Wait before syndicating:
- Publish on your site first
- Wait 1-2 weeks for indexing
- Then syndicate to other sites
- Include canonical tag or "originally published" link

Add attribution:

<p>Originally published on
   <a href="https://www.yoursite.com/article/">YourSite.com</a>
</p>

Use excerpt or modified version:
- Don't publish 100% duplicate
- Syndicate excerpt with link to full article
- Or create unique version for syndication

Platform-Specific Guides

Detailed implementation instructions for your specific platform:

Platform	Troubleshooting Guide
Shopify	Shopify Duplicate Content Guide
WordPress	WordPress Duplicate Content Guide
Wix	Wix Duplicate Content Guide
Squarespace	Squarespace Duplicate Content Guide
Webflow	Webflow Duplicate Content Guide

Verification

After implementing fixes:

Check redirects:

curl -I https://example.com
curl -I http://example.com
curl -I http://www.example.com
# All should 301 redirect to preferred version

Verify canonical tags:
- View source on a sample of pages from each template
- Confirm canonical tag present
- Verify it points to the correct URL
- Check consistency across site

Google Search Console:
- Wait 2-4 weeks for re-crawling
- Check the "Page indexing" report (formerly "Coverage")
- Verify duplicate warnings reduced
- Monitor indexed pages count

Site: search test:
- Google: site:yourwebsite.com "page title"
- Should see only one result
- Verify preferred version appears
- Check other versions redirect

Screaming Frog re-crawl:
- Run new crawl
- Check duplicates tab
- Verify duplicates eliminated
- Confirm redirects in place
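
The redirect check at the start of this section can be scripted so it is repeatable after each deployment. This is a sketch, not a complete monitoring tool: `head` sends one HEAD request without following redirects (similar to `curl -I`), and `evaluate_redirect` classifies the answer. Both names, the result labels, and the choice to accept 308 alongside 301 as permanent are assumptions.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def evaluate_redirect(status: int, location: Optional[str], preferred: str) -> str:
    """Classify the response of one duplicate URL variation."""
    if status == 200:
        return "duplicate: serves content instead of redirecting"
    if status in (301, 308):
        if location == preferred:
            return "ok"
        return "permanent redirect to a different URL (possible chain)"
    if status in (302, 303, 307):
        return "temporary redirect: change to 301"
    return f"unexpected status {status}"


def head(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send one HEAD request and return (status, Location header) without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

A typical use is `evaluate_redirect(*head("http://example.com/"), preferred="https://www.example.com/")`, repeated for each variation. The `Location` comparison is an exact string match, so pass the preferred URL exactly as the server emits it.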

Common Mistakes

No canonical tags - Letting search engines guess
Inconsistent canonical tags - Different tags on same content
Canonical pointing to the wrong URL - Referencing a redirected, parameterized, or non-preferred version
No 301 redirects - Relying only on canonical (use both)
Ignoring URL parameters - Creating unlimited duplicates
Multiple domains with same content - Splitting authority
Not handling WWW vs non-WWW - Common duplicate source
HTTP and HTTPS both accessible - Protocol duplication
Trailing slash inconsistencies - Both versions accessible
Syndicating without canonical - External duplicates hurting SEO