Google Crawlers Now Understand ‘Canonical’ URLs

Migrating a web site from one domain to another is never easy. You'll probably lose whatever Google ranking your old pages had, possibly break incoming links and generally disrupt the flux capacitor of the web.

Of course, there are occasionally good reasons to move your content, and now there's a new way to let Google know what you're up to. The Google Webmaster blog recently announced that Google will support the cross-domain rel="canonical" link element. That means you can effectively migrate your site to a new domain even if you don't have the server access needed to set up redirects.

Google still suggests that, wherever possible, you use 301 permanent redirects to point both visitors and search engine bots to your new domain. However, if that's not possible for some reason (for example, if you're migrating from a hosted blog service to your own domain), you can add a rel="canonical" link element to the head section of each old page and Google will index the new URL.
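As a rough illustration, here is a minimal Python sketch of what that element might look like for a migrated post. The function name canonical_link and both example domains are hypothetical, and the sketch assumes every page keeps the same path on the new domain, which may not be true of your site.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link(old_url: str, new_host: str) -> str:
    """Build a rel="canonical" link element for an old page.

    Assumes (for illustration only) that the page keeps the same
    path and query string on the new host.
    """
    parts = urlsplit(old_url)
    # Swap in the new host; drop any #fragment, which is not part of the page URL.
    new_url = urlunsplit((parts.scheme, new_host, parts.path, parts.query, ""))
    # Escape the URL so it is safe inside an HTML attribute value.
    return f'<link rel="canonical" href="{escape(new_url, quote=True)}" />'


# Hypothetical old (hosted) and new (self-hosted) addresses.
print(canonical_link("http://myblog.example.com/2009/12/my-post/", "www.example.org"))
```

The resulting element goes inside the head section of the old page, with its href pointing at the matching page on the new domain.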

Note that in our example -- moving from a hosted blogging service to a self-hosted domain -- it's OK if there are some differences between the new and old pages, but the basic content (the blog post) should be the same.

Previously, Google would look down on duplicate content across domains. Given the number of content-stealing "splogs" out there, filtering duplicate content by domain is a good way for Google to stop search engine spam. The problem is that there are legitimate reasons to have duplicate content, like migrating a site to a new domain, and now there's a way to tell Google which copy is the real one.

One important note: Google no longer recommends blocking access to duplicate content on your website, whether with a robots.txt file or other methods. Just use the rel="canonical" link element instead.
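If you want to confirm that an old page really carries the element, something like the following Python sketch could help. It uses only the standard library's HTML parser. The names CanonicalFinder and find_canonical are made up for this example, and the sample markup is invented.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel may hold several space-separated values, so split before checking.
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Invented sample page for illustration.
page = '<html><head><link rel="canonical" href="http://www.example.org/my-post/" /></head></html>'
print(find_canonical(page))
```

A page that returns None here has no canonical element, so Google has nothing to go on when choosing between the old and new copies.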
