On Tue, 25 Jan 2005 08:57:02 -0600, Kevin Puetz puetzk@puetzk.org wrote:
The best proposal I've seen so far is to apply it to all recently-modified pages (with 'recent' determined per project by how fast you think the editors will get it cleaned up - probably a day or so is reasonable). That assumes that it will get cleaned up before the timeout expires, and prevents it from being visible to a robot in the interim. But links that survive for a while still become rankable. And it's fairly simple, unlike tracking, per-link, whether or not it has been verified.
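For concreteness, the quoted rule could be sketched roughly like this (the function and setting names are mine, purely for illustration - not anything that exists in MediaWiki):

    import datetime

    # Hypothetical per-project setting: how long a page counts as
    # "recently modified" (a day or so, per the proposal above).
    NOFOLLOW_WINDOW = datetime.timedelta(days=1)

    def external_link_rel(last_edit_time, now=None):
        """Return the rel attribute to emit on external links, given the
        timestamp of the page's latest revision."""
        now = now or datetime.datetime.utcnow()
        if now - last_edit_time < NOFOLLOW_WINDOW:
            # Recently edited: any spam may not have been reverted yet,
            # so keep robots from counting these links.
            return ' rel="nofollow"'
        # Links that survive past the window become rankable again.
        return ''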
Although this *seems* simple, the database request needed to determine "how old is this page" is not, I suspect, a quick one, and server efficiency is a *BIG* issue with the Wikimedia sites. Add to that the complications to do with the cache - the simplest solution might be to block "nofollow" pages from being cached at all, which is an additional waste - and it begins to look like an imbalanced compromise. It would be better to determine more precisely the circumstances under which the attribute is appropriate, so that at least the costs would be well spent.
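To make the cost concrete: the renderer needs the latest revision's timestamp for every page view (or every cache check), and because the output depends on the page's *age*, a cached copy only stays valid until the window expires. The simplest option is not to cache such pages at all; a slightly smarter cache could keep them just until the window runs out, which is what this continuation of the sketch above shows (again, purely illustrative):

    def cache_ttl(last_edit_time, now=None):
        """How long rendered HTML stays valid under the age-based rule."""
        now = now or datetime.datetime.utcnow()
        age = now - last_edit_time
        if age < NOFOLLOW_WINDOW:
            # The "nofollow" version must be re-rendered (or purged) the
            # moment the window expires, so it can only be cached for the
            # remainder of the window.
            return NOFOLLOW_WINDOW - age
        # Past the window the output no longer depends on age, so it can
        # be cached until the next edit, as now.
        return None

Either way, that is extra cache churn purely as a function of time, with nobody actually doing anything to the page.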
Personally, I think a combination with new validation or edit-patrol features would be a better route to go down: if a state could easily be associated with the article saying "this version has been approved by a trusted user", that would seem a good criterion for removing the "nofollow"s. Of course, you still get some cache awkwardness, but unlike "age of this version", the software could explicitly purge the cache when the approved status changed (since someone would have performed that action), so the version with the "nofollow"s could stay cached until that happened. So you get less cost, and a better match with the general intentions.
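A rough sketch of why the approval-based scheme caches better (the "approved" flag and the toy cache here are imagined, not an existing feature):

    # Toy rendered-output cache and approval state, keyed by page title.
    rendered_cache = {}
    approved = {}

    def external_link_rel(title):
        # Unapproved versions carry "nofollow"; approved ones do not.
        return '' if approved.get(title) else ' rel="nofollow"'

    def mark_approved(title):
        # The approval action itself purges the cached copy, so the
        # "nofollow" version can safely stay cached right up until a
        # trusted user approves the version - no timeouts involved.
        approved[title] = True
        rendered_cache.pop(title, None)

The invalidation happens only when someone acts, which is exactly when the output actually changes.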