Duplicate Content – How Many Homepages Do You Have?

Many site owners are surprised when we tell them we found at least six versions of their homepage. To be fair, it’s unfortunate that search engines don’t always consolidate even obvious examples of duplicated homepages.

Here are the most common occurrences of duplicate homepages:

example.com
example.com/index.php (or .html)
example.com/default.aspx
example.com/home.php (or .html)
www.example.com
www.example.com/index.php (or .html)
www.example.com/default.aspx
www.example.com/home.php (or .html)

Even though you see the same content at all of these URLs, they are distinct URLs that can each be indexed by Google, Bing and other search engines.

If your homepage loads (without redirecting) at multiple URLs like the examples above, you have a great opportunity to make a small tweak that could give you a little boost in the search results.

Link & Authority Consolidation

Perhaps the best reason to eliminate duplicate homepage URLs is to consolidate all of your links and authority into one URL. As you can see in the screenshot below, this site has some decent links and authority going to both www.ledges.com and ledges.com (without the WWW). By combining these URLs (through 301 redirects), we can consolidate those links and authority to make the most of what we already have.

[Screenshot: duplicate content]

Common Homepage Duplicate Content Problems & Fixes
Forcing the WWW: Example.com vs. www.Example.com

If your site has this duplicate content issue, it’s probably affecting all of the pages on your site.  It doesn’t necessarily matter whether you force your site to load with or without the WWW as long as you choose one version or the other.  My preference is with the WWW.  As evolved as the Internet is, it seems that most people still expect (and link to) sites using WWW at the beginning.

Forcing the WWW on your site is a pretty easy fix through the .htaccess file on your Apache server.

WARNING: .htaccess files can be your best friend and your worst enemy. Make sure you create a backup copy of the original before making any edits.

# Send requests for the bare domain to the WWW hostname with a permanent (301) redirect
RewriteEngine On
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [L,R=301]

In this example, you just need to change “domain.com” to your actual domain.
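
Since either choice works as long as you pick one, here is a rough sketch of the opposite rule for anyone who prefers to drop the WWW. It assumes the same placeholder “domain.com” and the same Apache mod_rewrite setup as the rule above. Treat it as a starting point to test on your own server rather than a drop-in fix.

```apache
# Sketch: redirect the WWW hostname to the bare domain (the reverse of the WWW rule).
# "domain.com" is a placeholder. Use this OR the WWW rule, never both,
# otherwise the two redirects will bounce visitors back and forth.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.domain\.com$ [NC]
RewriteRule ^(.*)$ http://domain.com/$1 [L,R=301]
```

Whichever direction you choose, the path after the domain is carried over by the `$1` at the end of the target URL, so deep links keep working.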

You can also give Google a heads up on your WWW preference by letting them know in your Google Webmaster Tools account. Go to “Site Configuration”, then “Settings” and tell Google whether you want to display URLs with or without the WWW.

Rewriting Homepage and Index Files

Depending on the CMS, language or other technical intricacies of your website, it’s pretty common to have a homepage file like index.html (.php, .aspx, etc.) or home.html (.php, .aspx, etc.). For most websites, this is another quick fix that can be done through the .htaccess file of your Apache server.

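RewriteEngine On
RewriteBase /
# Match only direct browser requests for /index.php, not Apache's internal lookup of the index file
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/
RewriteRule ^index\.php$ http://www.yourdomain.com/ [R=301,L]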

In this example, you would just need to enter your website instead of “yourdomain.com” and change “index” and “.php” to match the file on your website.
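
If your homepage answers to more than one file name, a single pattern can cover several of them at once. The sketch below is only an illustration: it assumes the file names index.php, index.html, home.php and home.html from the list at the top of this post, and the placeholder “yourdomain.com”. Only include names that really are copies of your homepage, since any file matched here will stop being reachable at its own URL.

```apache
# Sketch: redirect direct requests for several homepage file names to the root URL.
# The file names and "yourdomain.com" are assumptions; adjust them to your site.
RewriteEngine On
RewriteBase /
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(index|home)\.(php|html)\ HTTP/
RewriteRule ^(index|home)\.(php|html)$ http://www.yourdomain.com/ [R=301,L]
```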

After fixing both of these issues, your website should resolve to www.yourdomain.com whether you enter yourdomain.com, yourdomain.com/index.html or www.yourdomain.com/index.html. Over the next few weeks, search engines will crawl your site and evaluate the 301 redirects. These redirects aren’t the silver bullet that will push you to the top, but they’re one of the many pieces of the puzzle that will help you get there.
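
For reference, this is roughly how the two fixes might sit together in one .htaccess file. It is a sketch under the same assumptions as the snippets above (Apache with mod_rewrite enabled, “yourdomain.com” as a placeholder, index.php as the homepage file). The order is a deliberate choice: putting the index rule first means a request for yourdomain.com/index.php lands on the final URL in one redirect instead of two.

```apache
RewriteEngine On
RewriteBase /

# 1. Direct requests for the index file go straight to the WWW root in one hop
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/
RewriteRule ^index\.php$ http://www.yourdomain.com/ [R=301,L]

# 2. Any remaining request on the bare domain moves to the WWW hostname
RewriteCond %{HTTP_HOST} ^yourdomain\.com$ [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [R=301,L]
```

After uploading the file, load each variation of your homepage in a browser and confirm that the address bar ends up on the same URL every time.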

Get Internet Marketing Insight For Your Company - SEO.com

9 Comments

  1. Hari Kumar says

    Keeping more than one index file is really bad for the website. I have found that some website owners keep both default.html and index.html files to target different visitors. Those who are ignorant of duplicate pages can be compared to those quacks selling snake oil.

  2. Mark Simchock says

    Thanks for the reminder Bryan.

    However, isn’t this what canonical is supposed to cure? This is especially true if you’re tagging your inbound links and they’re being shared back out again. Depending on the tagging a given page could have 10s or even 100s of unique URLs, right? And thus the support for canonical, yes?

    Granted it’s subjective, but sticking with the www seems dated to me. We no longer see ads with http://www.SomeSiteName.com. Nor does TV or radio verbalize the www. It’s SomeSiteName.com. No one under 40 expects the www, do they? My belief is, the shorter and less cluttered the URL / link the better. The www doesn’t add any value. It’s time to kiss it good-bye. IMHO, of course.
     

    • Bryan Phelps says

      In most cases, we still recommend fixing the root of the problem vs. slapping a bandaid (canonical tag) on the problem. Canonical definitely has its place and there are many times we still recommend using it.
      The WWW point is definitely more of a personal preference. I’ve just seen on many of my sites that when people naturally link to them, they tend to include the http://WWW. If you’re 301-redirecting the WWW to the non-WWW, you may lose a bit of that value with the 301. Again, this isn’t a major issue or concern. I’d also be happy to see WWW disappear.
      One situation where I DO prefer the WWW is if your site has many subdomains.  The WWW can help you more easily identify content on your root domain vs. a subdomain.
       

  3. Mark Simchock says

    Yeah, that makes sense. I’m all for solving problems at the root :) But perhaps add a bit about canonical to the main article. I think it would help.
    As for subdomains vs www. Valid point. I just prefer as short as possible.
    Thanks again.

  4. Calin Daniel says

    Hi Bryan, 

    Nice little round up of the canonical URL issues. I am also a proponent of the www version as opposed to non www. It would be interesting to conduct research on which version is actually more widely used.  

  5. says

    Oh God! Thank you so much. Now I’ve noticed our website has 3 different homepage URLs:

    One with /home
    One with non-www
    One with www

    I’m still looking for some more using your example above as my pattern.
    Thank you so much!

  6. Matt says

    It’s funny that search engines which hire thousands of programmers can’t put in a line of code to recognize www and non-www urls as one domain and then they penalize you for duplicate content!
    Nice article Bryan. Cheers
    -Matt
