On Wednesday 13 August 2003 09:18, tarquin wrote:
Nicholas Knight wrote:
By now it's probably obvious where I'm going with this. Could one of these methods (either storing a parsed and non-parsed version or the approach I took with "de-parsing") be used for some performance gain on Wikipedia's webserver?
De-parsing strikes me as a rather odd way to do it.
It's odd, no doubt about that; it just happens to be the best fit in my case. :)
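For what it's worth, the "store both a parsed and non-parsed version" idea could be sketched roughly like this. Everything here is hypothetical illustration (an in-memory store and a toy renderer that only handles bold/italic markup), not actual MediaWiki code:

```python
import re

# Hypothetical in-memory store; a real wiki would use its database.
articles = {}  # title -> {"wikitext": ..., "html": ...}

def render(wikitext):
    """Toy renderer: handles only '''bold''' and ''italic'' markup."""
    html = re.sub(r"'''(.+?)'''", r"<b>\1</b>", wikitext)
    html = re.sub(r"''(.+?)''", r"<i>\1</i>", html)
    return html

def save(title, wikitext):
    # On save, store both the editable source and the rendered copy.
    articles[title] = {"wikitext": wikitext, "html": render(wikitext)}

def view(title):
    # Page views serve the pre-rendered copy; no parsing at read time.
    return articles[title]["html"]

def edit_source(title):
    # Edits still operate on the original wikitext.
    return articles[title]["wikitext"]
```

The trade-off is exactly the one under discussion: reads get cheaper, but every article takes roughly twice the disk space.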
Furthermore, Jimbo has often remarked that disk space is not a problem (he may come to regret that remark when we hit a million articles... but hey! ;)
In general I wouldn't expect it to be a problem. It's just a concern on my personal server, for which I don't have much in the way of funds available for upgrading. I thought I'd throw it out there anyway, as it struck me as a rather elegant solution for cases where disk space is a problem. :)
I would suggest we consider semi-parsing. Save two versions of the article: a) the wikitext, and b) the wikitext parsed into HTML, with wikilinks still stored as [[link]]. Note that b) would not be a fully formed HTML document, just a fragment, since it would have no head section or enclosing tags.
Upon page read, it's b) that is inserted into the delivered page. Links are parsed live, since their status as existing, stub, or ghost depends on the state of the database at that moment.
Oops! Right, I forgot about that, since it's not applicable to my script ("all the world's a blog" syndrome? ;)). The 'semi-parsing' solution seems perfect to me.
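To make the two stages of the semi-parsing idea concrete, here's a rough sketch. The page table, stub threshold, CSS class names, and toy markup renderer are all hypothetical assumptions, just to show the split between save-time and view-time work:

```python
import re

# Hypothetical page table: title -> article size in bytes.
# A missing title means the page does not exist yet.
db = {"Foo": 5000, "Stub": 40}

STUB_THRESHOLD = 256  # assumed size cutoff for flagging a page as a stub

def semi_parse(wikitext):
    """Stage one, done at save time: render everything EXCEPT wikilinks."""
    html = re.sub(r"'''(.+?)'''", r"<b>\1</b>", wikitext)
    html = re.sub(r"''(.+?)''", r"<i>\1</i>", html)
    return html  # [[links]] are deliberately left untouched

def render_links(semi_html):
    """Stage two, done at view time: resolve [[links]] against the live db,
    so link status always reflects the database at that moment."""
    def link(match):
        title = match.group(1)
        size = db.get(title)
        if size is None:
            cls = "ghost"      # page does not exist yet
        elif size < STUB_THRESHOLD:
            cls = "stub"
        else:
            cls = "existing"
        return f'<a class="{cls}" href="/wiki/{title}">{title}</a>'
    return re.sub(r"\[\[(.+?)\]\]", link, semi_html)
```

Stage one runs once per edit and its output is what gets cached; stage two runs on every view but only has to do cheap link lookups, not a full parse.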