X(ML) Marks the Spot: Your Drupal SEO Guide to XML Sitemaps

As smart as Google’s search engine spiders are, even they can miss pages on your site while indexing for search results. Maybe you have moved a link to some content so that it’s no longer easy to reach. Or your site may be so big that Google can’t crawl all of it without eating up your server’s resources - not pretty!
The solution is simple: a sitemap. There are three main types of sitemaps you can use on your Drupal site, but we will cover the most important: the XML sitemap. XML sitemaps are designed to be used by search engines for indexing your pages.
Here is a refined definition according to www.sitemaps.org:
Sitemaps are an easy way for Webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.
Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata.
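To make that definition concrete, here is a minimal Python sketch that writes the kind of file sitemaps.org describes. The example URLs, the dates and the helper name build_sitemap are made up for illustration; the element names (urlset, url, loc, lastmod, changefreq, priority) and the namespace come from the sitemaps.org protocol. The XML Sitemap module generates all of this for you, so treat this as a picture of the output rather than the module's own code.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (as a string) for a list of page dicts.

    Each dict needs "loc"; "lastmod", "changefreq" and "priority" are optional.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, for illustration only.
example_pages = [
    {"loc": "http://www.example.com/", "lastmod": "2012-05-13",
     "changefreq": "daily", "priority": "1.0"},
    {"loc": "http://www.example.com/about", "priority": "0.5"},
]
sitemap_xml = build_sitemap(example_pages)
print(sitemap_xml)
```

Only loc is required for each URL; the other three fields are the optional metadata mentioned in the definition.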
For everything you possibly need to know about Drupal XML sitemaps, please join me after the jump...
Please Note: Using a sitemap does not guarantee that every page on your site will be included in the search engines. Rather, it helps the search engines find more of your pages. In most cases, submitting an XML sitemap to Google will increase the number of pages you see when you do a site: search.
A site: search (for example, site:www.yourDrupalsite.com) shows you how many pages of your site are included in the search engine index, as shown in the following screenshot:

The XML Sitemap Module
The XML Sitemap module for Drupal creates a sitemap for your site that conforms to the sitemaps.org specification. You can download it from http://drupal.org/project/xmlsitemap and install it just like any normal Drupal module. When you go to turn on the module, you’ll see a list that looks similar to this:

Step 1: Before you turn on any of the included modules, consider what content on your site you want to appear in the search engines and only turn on the modules you need.
• The XML sitemap is required. Turn it on.
• XML sitemap custom allows you to add your own customized links to the sitemap. I highly recommend turning this one on as well.
• XML sitemap engines will automatically submit your sitemap to the search engines each time it changes. This module is not strictly necessary, and there are better ways to submit your sitemap. However, it does a great job of helping you verify your site with each search engine, so turn this one on.
• XML sitemap menu adds your menu items to the sitemap. This is a good idea. Turn it on.
• XML sitemap node adds all your nodes, which are the bulk of your content. Turn it on.
• XML sitemap taxonomy adds all your taxonomy term pages to the sitemap. Generally, this is a good idea but some folks might not want this listed. Term pages are good category pages so I recommend turning this one on as well.
• Don’t forget to Save configuration.
Step 2: Go to Administer | Configuration | XML Sitemap and you should see a screen like this:

Step 3: Click on Settings and you should see a few options:

• Minimum sitemap lifetime: This sets the minimum amount of time that the module will wait before regenerating the sitemap. Use this feature if you have an enormous sitemap that is taking up too much of your server’s resources. Most sites should keep this setting on No minimum.
• Include a stylesheet in the sitemaps: This attaches a simple stylesheet to the generated sitemap. It is not necessary, but it is very helpful for troubleshooting or whenever human eyes view the sitemap. Leave it checked.
• Generate sitemaps for the following languages: In the near future, this option will allow you to specify sitemaps for different languages. This is very important for international sites that want to rank in localized search engines. For now, however, English is the only option.
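If the Minimum sitemap lifetime setting feels abstract, the idea can be sketched in a few lines of Python. This is only my reading of what the setting does, not the module's implementation; the function name should_regenerate and the dates are invented for the example.

```python
from datetime import datetime, timedelta


def should_regenerate(last_generated, now, minimum_lifetime):
    """Return True if the sitemap is old enough to be rebuilt.

    A minimum_lifetime of zero models the "No minimum" setting.
    """
    return now - last_generated >= minimum_lifetime


last_run = datetime(2012, 5, 13, 2, 0)
one_hour_later = last_run + timedelta(hours=1)

# With "No minimum", every cron run may rebuild the sitemap.
print(should_regenerate(last_run, one_hour_later, timedelta(0)))       # True
# With a one-day minimum, a rebuild one hour later is skipped.
print(should_regenerate(last_run, one_hour_later, timedelta(days=1)))  # False
```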
Step 4: Click the Advanced settings drop-down and you should see this:

• Number of links in each sitemap page allows you to specify how many links to pages on your website will be in each sitemap. Unless you are having trouble with search engines accepting your sitemap, leave this on Automatic.
• Maximum number of sitemap links to process at once sets the number of additional links that the module will add to your sitemap each time the cron runs. Leave this setting alone unless you notice that cron is timing out.
• Sitemap cache directory allows you to set where the sitemap data will be stored. This is data not seen by search engines or human visitors; it’s only used by the module.
• Base URL is the base URL of your site and generally should be left as is.
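Why would a sitemap need more than one page? The sitemaps.org protocol caps a single sitemap file at 50,000 URLs, so a big site has to be split into several files that are listed in a sitemap index. Here is a rough Python sketch of that splitting, assuming a hypothetical site with 120,000 nodes. The function name split_into_pages is my own, and how the module chooses its Automatic value is not something I am modeling here.

```python
MAX_LINKS_PER_SITEMAP = 50000  # upper limit set by the sitemaps.org protocol


def split_into_pages(urls, links_per_page=MAX_LINKS_PER_SITEMAP):
    """Split a list of URLs into sitemap pages of at most links_per_page."""
    if not 1 <= links_per_page <= MAX_LINKS_PER_SITEMAP:
        raise ValueError("links_per_page must be between 1 and 50000")
    return [urls[i:i + links_per_page]
            for i in range(0, len(urls), links_per_page)]


# Hypothetical site with 120,000 node URLs.
urls = ["http://www.example.com/node/%d" % n for n in range(1, 120001)]
pages = split_into_pages(urls)
print([len(page) for page in pages])  # [50000, 50000, 20000]
```

A smaller number of links per page means more, smaller files, which can help if a search engine is choking on one huge sitemap.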
Step 5: Click on the front page drop-down and set the following options:
• Front page priority: 1.0 is the highest setting you can give a page in the XML sitemap. For most websites, the front page is the single most important part of your site. This setting should be left at 1.0.
• Front page change frequency: Gives the search engines a hint about how often they should revisit your front page. Adjust this setting to reflect how often the front page of your site actually changes.
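Both of these settings end up as values in the sitemap file, and the sitemaps.org protocol limits what they can be: priority runs from 0.0 to 1.0, and change frequency must be one of seven keywords. A small Python sketch of that check follows; the helper name check_entry is hypothetical and is not part of the module.

```python
# The seven changefreq values allowed by the sitemaps.org protocol.
VALID_CHANGEFREQ = ("always", "hourly", "daily", "weekly",
                    "monthly", "yearly", "never")


def check_entry(priority, changefreq):
    """Return a list of problems with a sitemap entry (empty if none)."""
    problems = []
    if not 0.0 <= priority <= 1.0:
        problems.append("priority must be between 0.0 and 1.0")
    if changefreq not in VALID_CHANGEFREQ:
        problems.append("unknown changefreq: %s" % changefreq)
    return problems


print(check_entry(1.0, "daily"))        # a typical front page: no problems
print(check_entry(1.5, "fortnightly"))  # both values are out of range
```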
Step 6: Open the Content types drop-down and you should see this:

• You should see each Content type listed separately. You will want to leave these settings alone so that all your content shows up in the sitemap.
• If you do want to adjust these settings, you will need to go to the content type screen. Click on the name of the content type to go to its screen.
• On the content type screen, open the XML sitemap drop-down and you’ll get two options:

• Include in sitemap sets the default action for that content type - if you check this box, it will be included in the sitemap.
• Default priority allows you to set the default priority for each node that you create of that content type. The default is usually 0.5, but you can adjust it if you want certain kinds of pages to have a higher or lower priority.
• Click on Save content type. Repeat this process for each content type you want to change.
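Here is how I picture those two per-content-type options working together, sketched in Python. The content type names, the priorities and the function sitemap_entries are all invented for illustration, and I am assuming that an individual node can carry its own priority that wins over the type default. The module stores and applies these settings in its own way.

```python
# Hypothetical per-content-type settings mirroring the two options above.
CONTENT_TYPE_SETTINGS = {
    "page":  {"include": True,  "default_priority": 0.5},
    "story": {"include": True,  "default_priority": 0.8},
    "poll":  {"include": False, "default_priority": 0.5},
}


def sitemap_entries(nodes, settings=CONTENT_TYPE_SETTINGS):
    """Yield (path, priority) for nodes whose content type is included.

    A node may carry its own "priority" (an assumption of this sketch);
    otherwise the default for its content type applies.
    """
    for node in nodes:
        type_settings = settings[node["type"]]
        if not type_settings["include"]:
            continue
        yield node["path"], node.get("priority",
                                     type_settings["default_priority"])


example_nodes = [
    {"type": "page", "path": "node/1"},
    {"type": "story", "path": "node/2", "priority": 1.0},
    {"type": "poll", "path": "node/3"},
]
entries = list(sitemap_entries(example_nodes))
print(entries)  # [('node/1', 0.5), ('node/2', 1.0)]
```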
Step 7: Click Save configuration.
Step 8: Now it’s time to run cron. Cron is a recurring script that takes care of many maintenance tasks in Drupal, including populating the XML sitemap. To run cron, go to http://www.yourDrupalsite.com/cron.php and wait until the page has finished loading. You will receive no indication that it’s complete except that the page will stop loading.
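If you would rather trigger cron from a script than from your browser, a minimal Python sketch looks like this. It simply requests the same cron.php address and waits for the response. The function names cron_url and run_cron and the five-minute timeout are my own choices, and I am assuming cron.php is reachable without any extra key, as in the step above.

```python
from urllib.request import urlopen


def cron_url(base_url):
    """Build the cron.php address for a Drupal site's base URL."""
    return base_url.rstrip("/") + "/cron.php"


def run_cron(base_url, opener=urlopen, timeout=300):
    """Request cron.php and return the HTTP status code.

    The call returns once the server has answered the request.
    """
    response = opener(cron_url(base_url), timeout=timeout)
    try:
        return response.status
    finally:
        response.close()


# Example (not executed here, because it needs a live site):
# run_cron("http://www.yourDrupalsite.com")
print(cron_url("http://www.yourDrupalsite.com/"))
```

A status of 200 only tells you the request went through; you still need to look at sitemap.xml to confirm the sitemap was populated.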
Step 9: Go to http://www.yourDrupalsite.com/sitemap.xml. If you see something like this:

or a screen similar to this:

Then you’ve done it right! Congrats! Another round of espresso shots to all!
Keep in mind that the XML sitemap will only update when cron runs. On a normal Drupal installation, you should set cron to run periodically – nightly for most sites or more often for high-traffic sites.
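Beyond eyeballing the page, you can check a sitemap by parsing it and listing the URLs it contains. The Python sketch below does that for a tiny hand-written sample that stands in for your real sitemap.xml. The function name list_sitemap_urls is hypothetical, and the sketch only handles a single sitemap file, not a sitemap index that points to several pages.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemap_urls(xml_text):
    """Return the <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


# A tiny hand-written sample standing in for the real sitemap.xml.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc><priority>1.0</priority></url>
  <url><loc>http://www.example.com/node/1</loc><priority>0.5</priority></url>
</urlset>"""

found = list_sitemap_urls(sample)
print(len(found), "URLs:", found)
```

If the count is far lower than the number of pages you expect, cron probably has not finished adding links yet, so run it again and recheck.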
About the author

Ben Finklea
Ben entered the world of online marketing in 1995 when he founded a web design company from his dorm room at Texas A&M University. Since then, he has worked in various capacities in sales and marketing, from tiny start-ups to Apple Computer. In 2001, Ben founded Sprysoft, an e-commerce store that successfully sold over $5M in software online to students, teachers and schools. Ben formed SpryDev Online Marketing in 2005 to use the techniques and processes learned at Sprysoft to help other businesses sell online. SpryDev grew quickly and changed names to Volacci® in 2008.
Ben's book Drupal 6 Search Engine Optimization was released in September 2009 and is available from Amazon.com. In Dec 2010, Lullabot released their Drupal SEO Video on DVD starring Ben.
Twitter: http://www.twitter.com/benfinklea
LinkedIn: http://www.linkedin.com/in/benfinklea

Comments
Dev version - ready for production?
Ben,
I've been using the XMLsitemap module since the early days and was surprised by some of the new features/settings mentioned in your article. I notice that you are using the 6.x-2.x-dev version. Do you think the development version is stable enough for production use? If not, wouldn't it be more appropriate to focus the article on the latest stable release of the XMLsitemap module?
6.x-2.x? Absolutely!
Even in its unstable releases, we were having users with very large production sites using it. We now have an official 6.x-2.0-beta1 release and we're going to start highly recommending everyone be using that beta, because frankly you're going to be much better off than the performance black hole that is the 1.x version.
Plus, the 6.x-2.x and 7.x-2.x versions are kept exactly in sync, so when you're ready to port your site to Drupal 7, the module will be exactly the same.