On Thu, Jun 25, 2009 at 8:14 PM, Domas Mituzas <midom.lists@gmail.com> wrote:
The problem is quite simple, lots of people (like, million pageviews on an article in an hour) caused a cache stampede (all pageviews between invalidation and re-rendering needed parsing), and as MJ article is quite cite-heavy (and cite problems were outlined in http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/41547 ;) the reparsing was very very painful on our application cluster - all apache children eventually ended up doing lots of parsing work and consuming connection slots to pretty much everything :)
So if several page views request the same uncached page at the same time with the same settings, the later ones should all block on the first one's reparse instead of parsing it themselves. That should give faster service for big articles too, even ignoring load, since the earlier parse will finish before yours could anyway.
That seems pretty easy to do. Of course, you'd get some delays if everything ended up waiting on a process that died or hung.
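
To make the idea concrete, here is a rough sketch in Python rather than MediaWiki's PHP. It is not how MediaWiki does or would do it: the names `SingleFlight`, `do`, and `wait_timeout` are made up for illustration, and it only coalesces threads inside one process. Across Apache children you would need a shared lock (in memcached or similar) instead of `threading.Event`. The timeout is there for the "process that died" case: a waiter that gives up just parses the page itself.

```python
import threading


class _Call:
    """State shared between the one parser and everyone waiting on it."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class SingleFlight:
    """Coalesce concurrent requests for the same key onto one computation."""

    def __init__(self, wait_timeout=5.0):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> _Call currently being computed
        self._wait_timeout = wait_timeout

    def do(self, key, compute):
        # Decide under the lock whether we are the first request for this key.
        with self._lock:
            call = self._in_flight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._in_flight[key] = call

        if leader:
            try:
                call.result = compute()
            except BaseException as exc:
                call.error = exc
                raise
            finally:
                # Unregister first, then wake the waiters.
                with self._lock:
                    self._in_flight.pop(key, None)
                call.done.set()
            return call.result

        # Later request: block on the first one's work, but not forever.
        if call.done.wait(self._wait_timeout) and call.error is None:
            return call.result
        # The leader failed or took too long: fall back to doing it ourselves.
        return compute()
```

Here `key` would have to cover everything that affects the rendered output (page, revision, and the user settings that change parsing), so that only views "with the same settings" share a result. A call would look like `sf.do(("Michael_Jackson", options_hash), parse_page)`, where `options_hash` and `parse_page` are again hypothetical.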