On Fri, Mar 12, 2010 at 9:09 AM, Maarten Dammers <maarten@mdammers.nl> wrote:
> I understand this makes caching more difficult, but has anyone ever done measurements? Without decent metrics this is just wild guessing. Things to measure, for example:
> - Total text requests for all sites
> - Total text requests for just Commons
> - Total misses for all sites
> - Total misses for just Commons
> I'm sure someone can dig up these metrics.
All this stuff is measured and stored in various places, yes. I don't have the data handy, but last I heard the Squid hit rate for simple page views is well over 95%.
> Worst case, all text requests for Commons go to the backend directly, right?
Yes, and it would be disastrous. It would mean load on the application servers increasing by a factor of twenty or a hundred or more. Even if that happened only for Commons, it would be a huge increase in load. Keeping the Squid hit rate as high as possible is absolutely essential to keeping the site running properly, barring a huge investment in new hardware.
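To make the arithmetic concrete, using just the 95% figure above (the exact numbers are illustrative): at a 95% hit rate only about one request in twenty reaches the backend, so serving everything uncached is roughly a twentyfold increase, and at 99% it would be a hundredfold. A quick sketch of that calculation:

    # Rough illustration only: how bypassing the Squid cache multiplies backend load.
    def backend_multiplier(hit_rate):
        """Factor by which backend requests grow if every request skips the cache."""
        return 1.0 / (1.0 - hit_rate)

    for hit_rate in (0.95, 0.99):
        print("hit rate %.0f%% -> full bypass means %.0fx the backend load"
              % (hit_rate * 100, backend_multiplier(hit_rate)))
    # hit rate 95% -> full bypass means 20x the backend load
    # hit rate 99% -> full bypass means 100x the backend load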
Remember that if you create even two variants that are widely used, you've come reasonably close to doubling load on the backend. Only one person from each variant needs to view any given revision of a page to force the backend to generate two copies instead of one. Splitting visits to Commons across twenty different languages might double backend load by itself. That's just not tenable.
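A toy sketch of that effect, with made-up numbers, assuming the cache is effectively keyed by (revision, variant):

    # Sketch: backend renders when the parser/Squid cache is keyed per variant.
    # All numbers here are invented purely to illustrate the effect.
    from itertools import product

    viewed_revisions = range(1000)    # hypothetical revisions that get viewed
    one_variant = ["en"]
    two_variants = ["en", "de"]       # two widely used interface variants

    def renders_needed(revisions, variants):
        # Worst case: at least one viewer per variant for every revision,
        # so every (revision, variant) pair costs one backend render.
        return len(set(product(revisions, variants)))

    print(renders_needed(viewed_revisions, one_variant))   # 1000 renders
    print(renders_needed(viewed_revisions, two_variants))  # 2000 renders, doubled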
In principle, one could imagine hacking Squid to be smart enough to cache content and interface separately and paste them together on view, about as quickly as it can serve plain requests now. But I don't know how feasible that would be in practice.
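For what it's worth, here is a very rough sketch of what "cache content and interface separately" might look like, somewhat in the spirit of Edge Side Includes; none of this exists in the current setup, and the cache keys and helper functions below are invented for illustration:

    # Hypothetical fragment assembly (nothing like this exists today):
    # cache the rendered article body once per revision, the interface
    # chrome once per language, and glue them together when serving.
    cache = {}

    def cached(key, render):
        # Render on a miss, then serve from the cache afterwards.
        if key not in cache:
            cache[key] = render()
        return cache[key]

    def serve(revision_id, ui_language):
        body = cached(("body", revision_id),
                      lambda: "<div>rendered wikitext of rev %s</div>" % revision_id)
        chrome = cached(("chrome", ui_language),
                        lambda: "<nav>sidebar and tabs in %s</nav>" % ui_language)
        return chrome + body

    # Twenty interface languages hitting the same revision would cost one
    # body render plus twenty chrome renders, not twenty full page renders.
    for lang in ("en", "de", "fr", "nl"):
        serve(12345, lang)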