A Short Drupal Guide to XML Sitemaps
As thorough as Google’s spiders can be, they can still miss pages while indexing your site. Perhaps you have moved content around so that it is no longer easily accessible. Or perhaps your site is so big that Google’s spiders can’t crawl all of it without hogging your server’s resources. Yowzers!
No matter the problem, there is a solution: a sitemap. Sitemaps are an easy way for webmasters to inform search engines about pages on their site that are available for crawling. In its most common form, a sitemap is an XML file that lists the URLs for a site along with additional metadata about each URL, so that search engines can crawl the site more intelligently. There are three different types of sitemaps that you should consider in Drupal, and each has a different purpose. Let’s take a look at each one and discuss its advantages.
1. X(ML) Marks the Site
XML sitemaps are designed to be easily used by search engines. Google began supporting XML sitemaps in 2005. By the end of 2006, Google, Yahoo, and Microsoft had agreed upon the same sitemap specification, which they published at http://sitemaps.org. Not surprisingly, the Drupal community stepped up shortly afterward and created the XML Sitemap module. The XML Sitemap module creates a sitemap for your Drupal website that conforms to the sitemaps.org specification. The Drupal 6 version was developed by Kiam LaLuno. Dave Reid is working on version 2.0 of the module to address performance, scalability, and reliability issues.
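To make the format concrete, here is a minimal Python sketch of the kind of file the sitemaps.org specification describes. It is not how the XML Sitemap module works internally; the function name build_sitemap, the example.com URLs, and the dates are all made up for illustration. Each page gets a required loc tag plus the optional lastmod, changefreq, and priority tags.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return a sitemaps.org-style XML string for a list of page dicts.

    Hypothetical helper for illustration; not part of any Drupal module.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # <loc> is the only required child of <url>
        ET.SubElement(url, "loc").text = page["loc"]
        # Optional metadata tags; written only when the page provides them
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


# Made-up pages on a placeholder domain
example_pages = [
    {"loc": "http://www.example.com/", "lastmod": "2010-01-15",
     "changefreq": "daily", "priority": 1.0},
    {"loc": "http://www.example.com/node/42", "changefreq": "monthly"},
]

sitemap_xml = build_sitemap(example_pages)
print(sitemap_xml)
```

On a real Drupal site you would not write this by hand; the module generates and updates the file for you.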
2. Extra, Extra, Read All About It!
Google News is one of the most popular news sources online, simply by collecting and organizing news articles from other sites. If you are running a news website, you know how powerful it can be when Google News picks up your content: if your story lands on the front page, you could see 50,000 or more visitors to your website in only an hour or two. But getting listed takes more than just dumb luck. You need great content on timely or newsworthy subjects. You can also download, enable, and configure the Google News Sitemap module, which was originally created by Adam Boyse at Webopius and is now maintained by Dave Reid. Download the Google News Sitemap module from http://drupal.org/project/googlenews.
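For the curious, a Google News sitemap is a regular XML sitemap with an extra set of news tags on each URL. The Python sketch below shows the general shape, assuming the news namespace and tag names from Google's published News sitemap schema. The function name build_news_sitemap, the publication name, and the article are invented for illustration, and the output of the Drupal module may differ in its details.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# News extension namespace, as given in Google's News sitemap schema
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"


def build_news_sitemap(articles):
    """Return a news sitemap XML string (hypothetical helper, illustration only)."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS, "xmlns:news": NEWS_NS})
    for article in articles:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = article["loc"]
        # The news-specific metadata lives in a <news:news> wrapper
        news = ET.SubElement(url, "news:news")
        publication = ET.SubElement(news, "news:publication")
        ET.SubElement(publication, "news:name").text = article["publication"]
        ET.SubElement(publication, "news:language").text = article["language"]
        ET.SubElement(news, "news:publication_date").text = article["date"]
        ET.SubElement(news, "news:title").text = article["title"]
    return ET.tostring(urlset, encoding="unicode")


# A made-up article on a placeholder domain
example_articles = [
    {
        "loc": "http://www.example.com/news/drupal-sitemaps",
        "publication": "The Example Times",
        "language": "en",
        "date": "2010-01-15",
        "title": "Drupal Site Adds a Sitemap",
    },
]

news_sitemap_xml = build_news_sitemap(example_articles)
print(news_sitemap_xml)
```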
3. URL List Is Now Complete
Let’s play a game. Say for some reason you can’t use the XML Sitemap module. Maybe there’s a conflict with another module you’ve installed. Whatever the reason, there is an alternative to an XML sitemap. It’s not as powerful, but it serves as a functional solution. Drupal’s URL List module, maintained by David K. Norman, creates a list of every URL on your site, puts it into one huge text document with one URL on each line, and submits it to Google, Yahoo!, and many other search engines in lieu of an XML sitemap.
Using a sitemap does not guarantee that every page on your site will be included in the search engines. What it does is help the search engine spiders find more of your pages. In my experience, submitting an XML sitemap to Google greatly increases the number of pages that show up when you do a site: search.
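The text format is about as simple as a file can get. As a rough Python sketch (the function name build_url_list and the example.com URLs are made up, and this is not the URL List module's own code), building such a file amounts to writing one URL per line and skipping blanks and duplicates:

```python
def build_url_list(urls):
    """Return a plain-text URL list: one URL per line, no blanks or duplicates.

    Hypothetical helper for illustration only.
    """
    seen = set()
    lines = []
    for url in urls:
        url = url.strip()
        # Skip empty entries and URLs that were already listed
        if url and url not in seen:
            seen.add(url)
            lines.append(url)
    return "\n".join(lines) + "\n"


# Made-up URLs on a placeholder domain, including a duplicate and a blank
example_urls = [
    "http://www.example.com/",
    "http://www.example.com/node/1",
    "http://www.example.com/node/1",
    "   ",
    "http://www.example.com/about",
]

url_list_text = build_url_list(example_urls)
print(url_list_text, end="")
```

Unlike the XML format, there is no room here for metadata such as the last-modified date or priority, which is why the URL list is the less powerful option.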