More thoughts on Google's sitelinks algorithm
I was writing yesterday about Google's choice of sitelinks for a domain name, and I was speculating, based on the evidence of the links they list for currybetdotnet, that there may be some hand-editing involved. What got me started on this train of thought was an article by Ann Smarty on Search Engine Journal. She suggested six factors that make up the 'sitelinks algorithm'.
- Surfers oriented
- Domain-authority oriented
- Internal-architecture oriented
- On-page SEO oriented
- Brand-strength oriented
- Competition oriented
One of the things that intrigued me about her theories was how well they mapped to my own experience with currybetdotnet. Google sometimes displays a selection of sitelinks under the listing for this blog, and I'm interested in why they appear and how the links are chosen, because they are certainly not the ones that I would choose if given the option.
The seven articles Google uses as sitelinks for currybetdotnet are:
- Daily Sport brand hi-jacked by Russian RSS squatters
- Top 50 BBC Podcasts in Google Reader
- Reckless Records RIP - Part 1: An End Has A Start
- Doctor Who and the Vanishing Plaques
- Between a Northern Rock and a hard place
- Bloglines newspaper RSS subscriptions
- Merely trick photography
Of the theories that Ann proposes, I think a couple apply to currybetdotnet, but my findings on another couple are quite perplexing. Today I wanted to go through them all in turn.
Domain-authority oriented
This is one of Ann's suggested factors that I think the currybetdotnet site scores well on, and one that contributes to the fact that it has sitelinks at all. Obviously there are some domains on the web that you instinctively know are authorities, like Amazon or IMDB, but Google needs to measure the whole Internet for signs of whether a site is a good addition to their index or not. My site probably gives out the following 'trust' and 'quality' signals:
- Site has been consistently updated with new content for 6 years
- Site continually attracts new organic backlinks
- Site has gathered backlinks from other reputable domains (like bbc.co.uk, searchengineland.com, telegraph.co.uk, guardian.co.uk etc)
Competition oriented
Ann Smarty suggests that sitelinks appear when a site is a good match for a search query and there is little competition for that query. Testing currybetdotnet seems to back this part of the theory up.
Sitelinks appear for this site if you search for very specific queries related to the domain like "currybet" and "currybetdotnet". However, if you make a broader query which currybetdotnet ranks for, but where there is obvious competition, like "blog bbc media search", the sitelinks disappear.
Similarly, an exact search for "Martin Belam" throws up currybetdotnet with the sitelinks intact. A broader one-word search for "martin" lists currybetdotnet on the second page of the SERPs (behind Martin Stabe I note!) but without the sitelinks listed.
Brand-strength oriented
Whether my blog name has become a 'personal brand' is a debate for another day. For me, the significance of this factor in Ann's list is that from Google's point of view, the only site on the web where the phrases 'currybet' or 'currybetdotnet' appear a lot is this one. And virtually every other reference to 'currybet' on the web includes a link back to this domain. Algorithmically speaking, that might be a strong 'brand' indicator to Google.
Internal-architecture oriented
To find out how Google sees the architecture of currybetdotnet, I downloaded two files from Google's Webmaster Tools - a list of internal links and a list of external links. I wanted to see if there was any correlation between these numbers and Google's choice of sitelinks.
For internal structure, the most popular pages were those that appeared in the variable slots in the right-hand navigation at the time Google last deep-crawled the site. These are things that had recently been published, recently commented on, or featured in my popular articles and categories selections. This bore little or no correlation to the 'sitelinks' Google has chosen.
I performed the same exercise with the list of external links, and I found none of the sitelinks to be in the top 100 pages with the most external links pointing at currybet. I can't see that, for this site at least, Google's sitelinks choice is based on linking patterns.
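A check like this could be scripted roughly as follows. This is only a sketch: the column names `url` and `links`, the helper names, and the sample rows are all invented for illustration, and are not what the Webmaster Tools export actually looks like.

```python
import csv
import io


def top_linked_pages(csv_text, n=100):
    """Return the n URLs with the most links, highest first."""
    rows = csv.DictReader(io.StringIO(csv_text))
    counts = {row["url"]: int(row["links"]) for row in rows}
    ranked = sorted(counts, key=counts.get, reverse=True)
    return ranked[:n]


def sitelinks_in_top(csv_text, sitelinks, n=100):
    """Return the sitelink URLs that also appear among the n most-linked pages."""
    top = set(top_linked_pages(csv_text, n))
    return [url for url in sitelinks if url in top]


# Hypothetical sample data, standing in for a downloaded link report.
SAMPLE = """url,links
/page-a,120
/page-b,95
/page-c,61
/page-d,3
"""

# /page-d is the fourth most-linked page, so it misses a top-3 cut-off.
print(sitelinks_in_top(SAMPLE, ["/page-d"], n=3))
```

Running the same function once over the internal links file and once over the external links file, with the seven sitelink URLs as input, would show how many of them make the top 100 in each list.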
On-page SEO oriented
This theory doesn't really apply to currybetdotnet. My pages are all coded in a search engine friendly way, and because I am using Movable Type as my CMS, all the anchor text is consistent. However, none of the pages on the site are optimised for a specific keyword - the on-page factors are simply based on the title of the article.
Surfers oriented
On this count I don't think the sitelinks Google has chosen for me serve my audience very well at all. Looking at my traffic figures, despite these prominent links from Google I don't see these articles performing especially well in terms of page impressions. The best performing page of the seven isn't in the top fifty pages viewed so far this year.
Sitelinks factors in numbers
This table sums up the numbers around some of the factors I've looked at for these pages.
Article | Publishing date | Comments | Internal links | External links | Page views (Year-to-date) |
---|---|---|---|---|---|
Daily Sport brand hi-jacked | 4/11/2007 | 0 | 3 | 0 | 42 |
Top 50 BBC Podcasts | 22/11/2007 | 1 | 7 | 0 | 640 |
Reckless Records | 13/8/2007 | 0 | 61 | 1 | 389 |
Dr Who and the Vanishing Plaques | 9/8/2007 | 0 | 5 | 4 | 138 |
Between a Northern Rock | 17/9/2007 | 0 | 3 | 0 | 16 |
Bloglines newspaper blog RSS subscriptions | 23/5/2007 | 0 | 8 | 14 | 127 |
Merely trick photography | 7/9/2007 | 1 | 3 | 5 | 147 |
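One way to read a table like this is to re-sort the rows by each factor in turn and see whether any single column puts the articles in the order they are listed. The snippet below is a small sketch of that idea using the figures from the table; the names `ROWS`, `FACTORS` and `order_by` are made up for the example.

```python
# (title, comments, internal links, external links, page views),
# in the same order as the rows of the table above.
ROWS = [
    ("Daily Sport brand hi-jacked", 0, 3, 0, 42),
    ("Top 50 BBC Podcasts", 1, 7, 0, 640),
    ("Reckless Records", 0, 61, 1, 389),
    ("Dr Who and the Vanishing Plaques", 0, 5, 4, 138),
    ("Between a Northern Rock", 0, 3, 0, 16),
    ("Bloglines newspaper blog RSS subscriptions", 0, 8, 14, 127),
    ("Merely trick photography", 1, 3, 5, 147),
]

# Column index of each numeric factor within a row tuple.
FACTORS = {"comments": 1, "internal": 2, "external": 3, "views": 4}


def order_by(factor):
    """Return article titles sorted by the given factor, highest first."""
    idx = FACTORS[factor]
    return [row[0] for row in sorted(ROWS, key=lambda r: r[idx], reverse=True)]


listed_order = [row[0] for row in ROWS]
for factor in FACTORS:
    matches = order_by(factor) == listed_order
    print(f"{factor}: top article is {order_by(factor)[0]!r}, "
          f"matches listed order: {matches}")
```

None of the four columns reproduces the listed order on its own, and each of internal links, external links and page views puts a different article at the top.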
To block or not to block
Now, I don't think that this selection of sitelinks is particularly representative of the best content from currybetdotnet, or the most useful links for me as a consultant - I'd much rather the 'About' page and my 'CV' were listed. Google's Webmaster Tools gives you the option of 'blocking' a page from appearing in the sitelinks list, and I've often wondered whether it would be worth my while blocking these links whilst waiting for Google to choose something better.
I see you've overtaken me for "Martin" SERP position, probably thanks to this post!
In the long run, I think we'd better both watch our backs for Messrs Moore and Rosenbaum.
Very interesting article ... In fact my site is also showing sitelinks, and I can confirm most - but not all - of the points. I'm speaking here about the keyword "Alopezie" and the site alopezie.de, which is a German site about hair loss.
Basically Google seems to have picked quite an interesting choice of subpages, which does not give a clear picture at first.
The first column has 4 links to 4 (out of 6) forums. That makes sense, as the minor ones are left out. The selection seems NOT to have been made by hand, as a quite strange abbreviation "Allg." is used (it stands for "general"), which is pretty misleading. So I guess they used visitor numbers to select this forum.
The second column of sitelinks is a crude mixture from other places: 3 links are related to content, and one goes to "help on the forum" (for what ??). The 3 links to content fit pretty well; how they were selected, I don't know ...
1. Surfers oriented:
Yes, the selection makes sense (with the limitation of the title and the strange link to help on Forums)
2. Domain-authority oriented
Yes, the domain is exactly the keyword
3. Internal-architecture oriented
Only partly. The forums are well visible on the site, the other 4 links are very hard to find
4. On-page SEO oriented
Clearly NO, 3 links are horribly bad (Which supports my opinion, that Google does not need SEO-friendly links ...)
5. Brand-strength oriented
Yes, this clearly fits, although I doubt that it played a role here.
6. Competition oriented
Yes, little competition here. In fact, positions 1 and 2 are currently taken by the site.
I'm beginning to suspect that you don't ACTUALLY believe that they're hand picked, but that you're just saying this to get traffic!
Anyway, you can block individual sitelinks using the Google Webmaster Tools (I just discovered).
Hi Martin,
Interesting piece. I came across it while I was looking for information on the subject of sitelinks. Maybe you can answer my question: I was wondering how long it takes to get sitelinks back once you've lost them. Assuming that straight after losing them you would "qualify" to have them again (for whatever reason), would you get them back straight away, or is there a certain waiting period? I ask because my (also German) website 'toptarif autoversicherung' just lost its sitelinks for the word Autoversicherung, and I think I would qualify to have them again (I got some additional backlinks and made some on-site changes). I'm losing a lot of traffic because of this and, as you can imagine, I would like to get that back :)
Kind Regards,
Frits