23 Jan 2007
Posted by Skitzzo as Linking, SEO tactics
If you haven’t been hiding under a rock lately, you’ve probably heard that Wikipedia recently reinstated the NoFollow tags for all external links. The English version of the user-edited encyclopedia had been the only version without the NoFollow tags (a method, according to Google, of preventing spam).
The move made headlines all across the online community, and reactions ranged from support (Randfish at SEOmoz and Carsten Cumbrowski at Search Engine Journal) to frustration from some of our SEO forum members (myself included).
But, no matter which side of the issue you come down on, you might be surprised to learn that in fact NOT ALL external links in Wikipedia have the NoFollow tag applied.
That’s right. Despite all of the reports to the contrary, a small number of “clean” outgoing links still exist on Wikipedia.
Here’s a screenshot of what you would see if you went to the wiki page about sabotage.

No big deal, right? Well, it is when you’re using the handy SEObook extension for Firefox that highlights any NoFollowed links on a given page, and the external links in that screenshot aren’t highlighted. This would seem to suggest that in fact NOT all external Wikipedia links have the NoFollow tag applied to them. What gives?
This could of course simply be a fluke; however, I found another example when I navigated to the Bond (finance) page.
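If you don't have the extension handy, the same check can be roughly approximated in a few lines of Python. This is only a sketch: the function and class names are made up for illustration, the sample HTML is invented, and it treats any link to a different host as "external", which may not match how the SEObook extension decides what to highlight.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class FollowedLinkFinder(HTMLParser):
    """Collect external links whose rel attribute lacks "nofollow"."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host  # host of the page being inspected
        self.followed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href") or ""
        host = urlparse(href).netloc
        # Relative links and links to the page's own host count as internal.
        if not host or host == self.own_host:
            return
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" not in rel_tokens:
            self.followed.append(href)


def followed_external_links(html, own_host):
    """Return hrefs of external links that are NOT marked nofollow."""
    finder = FollowedLinkFinder(own_host)
    finder.feed(html)
    return finder.followed


# Invented sample markup: one internal link, one tagged link, one "clean" link.
sample = (
    '<a href="/wiki/Bond">internal</a>'
    '<a href="http://example.com/a" rel="nofollow">tagged</a>'
    '<a href="http://example.org/b">clean</a>'
)
print(followed_external_links(sample, "en.wikipedia.org"))
# ['http://example.org/b']
```

Feeding it the saved HTML of a page gives you the list of "clean" outgoing links, if any.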

At this point I shot over to the Global Warming page to make sure my Firefox plugin wasn’t screwed up somehow. Since this page (along with the new SEO contest) was reportedly the reason the NoFollow tags were applied, it only made sense that this page would indeed be full of NoFollow’ed links. Sure enough, those “link condoms” made my screen glow…

So, NoFollow tags certainly are appearing on MOST of the external links on Wikipedia pages. But it is far from ALL of them… In fact, in my quick search for more examples I found another example on the arbitrage page…

As I said, I only searched through 10-15 pages for different terms that came to mind quickly, but I found 5 examples of pages WITHOUT the NoFollow tag applied to external links. I guess “Most Wikipedia Links NoFollowed” isn’t quite as sexy a headline… but in this case, it appears to be an accurate one.
As always, please feel free to comment or join in our discussion of this topic in the forum.
© 2010 SEO Refugee - Search Engine Optimization Blog and Forums powered by WordPress

5 Responses
AG
January 23rd, 2007 at 7:10 pm
To modify every “a href…” to include nofollow tags, Wikipedia needs to effectively regenerate the code for anywhere from one point six to three million pages. (Figures are a bit vague because I can’t recall if nofollow was already enabled on non-article pages). As that’d put undue strain on the system, it doesn’t do it at once; it does it when the cached copy of any given page is cleared out and refreshed. So if the page you’re looking at hasn’t been altered since the change - “sabotage” was last edited on Jan 12th - chances are it hasn’t been recached yet, and so the links haven’t been “nofollowed”.
Wikipedia (and MediaWiki generally) would indeed *like* to have some kind of selective enabling/disabling/whitelisting/whatever for nofollow tags and URLs; please send them any suggestions which can easily scale up to an active community of fifty thousand people and three million pages…
Lea
January 23rd, 2007 at 11:35 pm
Not a cache issue?
I thought I saw the same thing on some pages, but when I checked again the next day they were nofollowed. I assumed it was my cache clearing, but who knows?
AG: I would have thought the point of a wiki was that every page was generated on the fly from the db?
Remember, the editor does NOT enter anchor tags but wiki markup, so all that needs to be done is to alter the way the anchor element is generated from the wiki markup.
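To make that point concrete, here is a toy sketch of turning bracketed external-link markup into anchor elements. It is an illustration under assumptions, not MediaWiki's parser: the regex only handles the simple `[http://url label]` form, and `render_external_links` and its `nofollow` switch are hypothetical names.

```python
import re
from html import escape

# Matches the simple external-link form: [http://example.com optional label]
EXTERNAL_LINK = re.compile(r"\[(https?://[^\s\]]+)(?: ([^\]]+))?\]")


def render_external_links(wikitext, nofollow=True):
    """Replace bracketed external links with <a> elements."""
    rel = ' rel="nofollow"' if nofollow else ""

    def to_anchor(match):
        url = match.group(1)
        label = match.group(2) or url  # fall back to the bare URL
        return f'<a href="{escape(url, quote=True)}"{rel}>{escape(label)}</a>'

    return EXTERNAL_LINK.sub(to_anchor, wikitext)


print(render_external_links("See [http://example.com Example]."))
# See <a href="http://example.com" rel="nofollow">Example</a>.
print(render_external_links("See [http://example.com Example].", nofollow=False))
# See <a href="http://example.com">Example</a>.
```

The source text is identical in both calls; only the rendering step decides whether the attribute appears.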
Trond Sorvoja
January 24th, 2007 at 4:17 am
It is a server-side cache issue; all external links have the fraking nofollow attribute. That being said, when are you going to remove the nofollow on the comments in this blog?
AG
January 24th, 2007 at 12:59 pm
Lea: Many wikis probably do; I’m not that well versed in the details. However, en.WP is the single most heavily used wiki around, by some sizable margin, and one of the ways to cope with the traffic is a very heavy layer of caching - it simplifies matters enormously if only a small proportion of pageviews require a database query, rather than them all.
Basically, the HTML rendering is done a stage earlier than you think it is. My understanding is that almost all pages the casual reader sees* are lumps of pre-generated HTML stored in the cache servers and sent out to readers; whenever the page is edited, or any “component” (templates, images) included into the page is itself altered, the HTML is regenerated from the source by a central server and sent to the caches.
But if you change the way the HTML parser renders the source, it doesn’t know to invalidate the existing HTML - it’s all been done, after all, and nothing has changed in the source. So old things like this can persist for a short while even when most pages have changed over, simply because the system hasn’t needed to regenerate them.
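A toy model of that behaviour, for anyone who finds code easier to follow than prose. Everything here is an assumption made for illustration (the `PageCache` class, the two renderer functions, the example URL); it is not how Wikipedia's cache servers are actually built, only the same idea in miniature: edits invalidate the stored HTML, a change of renderer does not.

```python
class PageCache:
    """Stores rendered HTML per page; only edits or purges invalidate it."""

    def __init__(self, renderer):
        self.renderer = renderer  # function: source text -> HTML
        self.sources = {}
        self.html = {}

    def edit(self, title, source):
        self.sources[title] = source
        self.html.pop(title, None)  # an edit drops the cached HTML

    def view(self, title):
        if title not in self.html:  # render only on a cache miss
            self.html[title] = self.renderer(self.sources[title])
        return self.html[title]

    def purge(self, title):
        self.html.pop(title, None)  # force regeneration on next view


def old_renderer(url):
    return f'<a href="{url}">{url}</a>'


def new_renderer(url):
    return f'<a href="{url}" rel="nofollow">{url}</a>'


cache = PageCache(old_renderer)
cache.edit("Sabotage", "http://example.org")
cache.view("Sabotage")            # rendered and cached with the old renderer

cache.renderer = new_renderer     # the rendering rules change, the source does not
stale = cache.view("Sabotage")    # still served from the cache

cache.purge("Sabotage")
fresh = cache.view("Sabotage")    # regenerated with the new renderer

print("nofollow" in stale, "nofollow" in fresh)
# False True
```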
If you want to force the change, to see if it’s working okay, use http://en.wikipedia.org/w/index.php?title=PAGENAME&action=purge - this forces a purge of the cache - and reload.
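If you are checking a handful of pages, a small helper can fill in PAGENAME for you. This is a sketch with a hypothetical function name; swapping spaces for underscores is my assumption about the usual title form, and the helper only builds the URL string without sending any request.

```python
from urllib.parse import urlencode


def purge_url(page_name, base="http://en.wikipedia.org/w/index.php"):
    """Build the action=purge URL for a page title."""
    title = page_name.replace(" ", "_")  # assumed title convention
    return f"{base}?{urlencode({'title': title, 'action': 'purge'})}"


print(purge_url("Sabotage"))
# http://en.wikipedia.org/w/index.php?title=Sabotage&action=purge
print(purge_url("Bond (finance)"))
# http://en.wikipedia.org/w/index.php?title=Bond_%28finance%29&action=purge
```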
As to untouched older pages periodically regenerating - I think they are, but I don’t know for sure. This would presumably stop it persisting *too* long, but cache servers can do strange things.
*logged-in users may be handled differently, I vaguely recall.
Skitzzo
January 24th, 2007 at 5:42 pm
Sorv, the links on your name are not nofollowed, only the links entered in the comments. To be honest, I’m not sure where that’s set up. I’ll try and track it down though.
If anyone knows where to look please feel free to share.