Duplicate content can happen to anyone, at any time, and most who are guilty don’t even know they are doing it. In fact, I would say it is one of the most common SEO mistakes made by people running Drupal sites. For your Drupal site to rank well in the search engines, it is critical that you eliminate your duplicate content.
But before I show you how to eliminate duplicate content, let’s take a look at exactly what is considered duplicate content. What exactly does it mean to have duplicate content?
I’m glad you asked. Per Google Webmaster Central, duplicate content generally refers to “substantive blocks of content within or across domains that either completely match other content or are appreciably similar.”
Examples of non-malicious duplicate content:
• Discussion forums that can generate both regular and stripped-down pages targeted at mobile devices.
• Store items shown or linked via multiple distinct URLs
• Printer-only versions of web pages
With regard to SEO, if you have duplicate content, you are hurting your own rankings by competing against yourself. Google will recognize the duplicate content and, in some cases, will not be able to tell which page is the original source. When this happens, Google spreads the link juice evenly among the pages, causing your content to compete against itself for rankings.
If your site does contain multiple pages with very similar content, there are many ways you can indicate your preferred URL to Google. This is called canonicalization. I will get to canonicalization a bit later.
If you are running a Drupal-built site, there are a number of ways (and modules) to prevent Google’s vicious slap of shame dished out for duplicate content. One of the main modules we highly recommend for superior Drupal SEO performance (and one that comes bundled in the Drupal SEO Checklist module) is the Global Redirect module.
The Global Redirect Module
What does the Global Redirect module do? It takes care of some housekeeping issues that come up when clean URLs are enabled in Drupal, eliminating the duplicate content issues that you may not have known you had.
Let's say, for example, that you create a new website and create the first node that you call the About Us page. Later, because you want the front page of your site to be the content of that node, you go into site settings and make node/1 the front page of the site. Sounds pretty harmless, right? Well, right at this moment, all of these URLs on your site would show the exact same content:
http://www.yourDrupalsite.com/
http://www.yourDrupalsite.com/?q=node/1
http://www.yourDrupalsite.com/node/1
http://www.yourDrupalsite.com/node/1/
http://www.yourDrupalsite.com/about-us
http://www.yourDrupalsite.com/about-us/
The search engines will think that you have six pages of the exact same content. That's never good. Global Redirect fixes that by redirecting all the URLs you don't want to the one URL that you do.
Here's the logic that Global Redirect uses (sourced from http://drupal.org/project/globalredirect):
• Checks the current URL for an alias and does a 301 redirect to it if it is not being used.
• Checks the current URL for a trailing slash, removes it if present, and repeats check 1 with the new request.
• Checks if the current URL is the same as the site_frontpage and redirects to the frontpage if there is a match.
• Checks if the Clean URLs feature is enabled and then checks that the current URL is being accessed using the clean method rather than the unclean method.
• Checks access to the URL. If the user does not have access to the path, then no redirects are done. This helps avoid exposing private aliased nodes.
• Makes sure the case of the URL being accessed is the same as the one set by the author/administrator. For example, if you set the alias "articles/cake-making" to node/123, then the user can access the alias with any combination of case and will be redirected to the correctly cased URL.
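To make the checks above more concrete, here is a minimal Python sketch of the same decision process. It is only an illustration, not the module's actual code (Global Redirect itself is a Drupal module written in PHP). The names Site, requested_path, and redirect_target are hypothetical, and the order of the checks is simplified.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple
from urllib.parse import parse_qs, urlsplit


@dataclass
class Site:
    """Hypothetical stand-in for the site data the checks consult."""
    aliases: Dict[str, str]   # alias -> system path, e.g. "about-us" -> "node/1"
    front_page: str           # system path configured as the front page
    denied: Set[str] = field(default_factory=set)  # paths the user cannot view


def requested_path(url: str) -> Tuple[str, bool]:
    """Return (path, was_unclean) for a clean URL or a ?q= style URL."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    if "q" in query:
        return query["q"][0], True
    return parts.path.lstrip("/"), False


def redirect_target(url: str, site: Site) -> Optional[str]:
    """Return the path to 301-redirect to, or None if no redirect is needed."""
    path, unclean = requested_path(url)
    stripped = path.rstrip("/")  # trailing slash check

    # Match aliases regardless of case so the stored spelling can be restored.
    stored_alias = {alias.lower(): alias for alias in site.aliases}
    alias_of = {system: alias for alias, system in site.aliases.items()}

    if stripped.lower() in stored_alias:
        system = site.aliases[stored_alias[stripped.lower()]]
    else:
        system = stripped

    # Access check: never redirect a user to a page they cannot view.
    if system in site.denied:
        return None

    # Front page check, then alias check.
    if system == site.front_page:
        canonical = ""
    else:
        canonical = alias_of.get(system, system)

    # Redirect unclean URLs and anything that differs from the canonical path.
    if unclean or path != canonical:
        return "/" + canonical
    return None


site = Site(
    aliases={"about-us": "node/1", "articles/cake-making": "node/123"},
    front_page="node/1",
)
for url in (
    "http://www.yourDrupalsite.com/",
    "http://www.yourDrupalsite.com/?q=node/1",
    "http://www.yourDrupalsite.com/node/1",
    "http://www.yourDrupalsite.com/node/1/",
    "http://www.yourDrupalsite.com/about-us",
    "http://www.yourDrupalsite.com/about-us/",
):
    print(url, "->", redirect_target(url, site))
```

With this sample data, the first URL needs no redirect (the function returns None) and the other five all point to "/", so the six URLs from the About Us example collapse into one.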
That's a lot of work for one module and it does it quite well.
Thanks for this module, Nicholas Thompson!
How to install and configure the Global Redirect module
Carry out the following steps to install and configure the Global Redirect module:
• The Global Redirect module installs just like any other Drupal module. Download it from http://drupal.org/project/globalredirect
• To configure the module, point your browser to http://www.yourDrupalsite.com/admin/settings/globalredirect or click on Admin | Site configuration | Global Redirect
• You should see the Global Redirect settings page.
The default settings are the right ones for most websites. However, it can be helpful to know what's going on with Global Redirect. Here is a quick breakdown of each setting.
• Deslash: Set to On. If enabled, this option will remove the trailing slash from requests. This stops requests such as yourDrupalsite.com/node/1/ failing to match yourDrupalsite.com/node/1 and creating duplicate content. On the other hand, if you require certain requests to have a trailing slash, this feature can cause problems and so may need to be disabled.
• Non-clean to Clean: Set to On. If enabled, this option will redirect from the non-clean to the clean URL (if Clean URLs are enabled). This will stop, for example, node 1 existing on both yourDrupalsite.com/node/1 and yourDrupalsite.com/?q=node/1.
• Remove Trailing Zero Argument: Set to Disabled. If enabled, any instance of /0 will be trimmed from the right of the URL. This stops duplicate pages such as taxonomy/term/1 and taxonomy/term/1/0, where 0 is the default depth. There is an option of limiting this feature to taxonomy term pages only or allowing it to affect any page. By default this feature is disabled to avoid any unexpected behavior.
• Menu Access Checking: Set to Disabled. If enabled, the module will check that the user has access to the page before redirecting. This helps to stop redirection on protected pages and avoids giving away secret URLs. By default this feature is disabled to avoid any unexpected behavior.
• Case Sensitive URL Checking: Set to Enabled. If enabled, the module will compare the current URL to the alias stored in the system. If there are any differences in case then the user will be redirected to the correct URL.
Click Save configuration. Now your site is protected from duplicate content.
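One way to picture the five settings is as simple toggles with the defaults listed above. The Python sketch below is only an illustration of that idea and of the Remove Trailing Zero Argument option in particular. The class name, field names, and the three mode strings ("disabled", "taxonomy", "all") are assumptions made for this example and are not the module's real variable names.

```python
import re
from dataclasses import dataclass


@dataclass
class GlobalRedirectSettings:
    """Hypothetical model of the five options, using the defaults listed above."""
    deslash: bool = True
    nonclean_to_clean: bool = True
    trailing_zero: str = "disabled"  # assumed modes: "disabled", "taxonomy", "all"
    menu_access_check: bool = False
    case_sensitive_check: bool = True


# Matches taxonomy term paths that end in a /0 depth argument.
TERM_WITH_ZERO = re.compile(r"taxonomy/term/[^/]+/0")


def apply_trailing_zero(path: str, settings: GlobalRedirectSettings) -> str:
    """Trim one trailing /0 from the path if the chosen mode allows it."""
    mode = settings.trailing_zero
    if mode == "disabled" or not path.endswith("/0"):
        return path
    if mode == "taxonomy" and not TERM_WITH_ZERO.fullmatch(path):
        return path
    return path[:-2]


defaults = GlobalRedirectSettings()
taxonomy_only = GlobalRedirectSettings(trailing_zero="taxonomy")
everything = GlobalRedirectSettings(trailing_zero="all")

print(apply_trailing_zero("taxonomy/term/1/0", defaults))
print(apply_trailing_zero("taxonomy/term/1/0", taxonomy_only))
print(apply_trailing_zero("node/5/0", taxonomy_only))
print(apply_trailing_zero("node/5/0", everything))
```

With the default settings nothing is trimmed. Limiting the option to taxonomy term pages trims taxonomy/term/1/0 but leaves node/5/0 alone, which is the cautious choice if other pages on your site use a meaningful trailing 0.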
Great job!