Hello,
I understand the need for cite, that's why it is still there :) But...
- We format the Cite references list on every 100th request to the backend, yet it takes 8.15% of backend response time (thanks, parser cache; without it Cite formatting would take 815% of cluster time, though developers should understand I'm not exactly right with this hyperbole ;-)
- When parsing articles like one of the most popular today, [[en:Rod_Blagojevich_corruption_charges]], it takes 20s to produce the page, and 17s of that is spent on the Cite block, mostly executing {{cite}}. That makes every editor wait for ages to get a page displayed, and due to the cache stampede after invalidation it causes considerable stress on the site (look at the numbers mentioned above).
- This 8% is in real time, which includes waiting for search, databases, and simply CPU contention, which we end up having today. CPU-time-wise it is way higher, so it can actually have a 20% CPU time impact on our application farm. That's at least $100k worth of hardware (and rising), even if new/modern, just for citation formatting.
So, a checklist of what can be done (simple to complex):
[ ] - Simplification of {{cite}}
[ ] - Separate cache for Cite, to avoid reparsing on minor edits, that don't involve citations. I have no idea how much this would win, but there is theoretical chance of stripping 1% or so. ;)
[ ] - Offload some templates like {{cite}} to actual PHP extensions (can of worms, but, oh well, can be standardized process too)
[ ] - Implement proper scripting engine like Lua for metatemplates (http://pecl.php.net/package/lua - another can of worms, though yet again, can be managed via trusted set of people, on top20 wikis or so).
[ ] - Frustrated operations guy adding something like ( return ""; ) in some random extension, and syncing the live hack. Obviously there would be some "HAHA YOU THOUGHT I COULDN'T DO THIS" comments in there.
I for one can directly participate in at least two of these options. ;-)
Unfortunately, {{cite}} is the only template I can profile/account for now, we don't have proper per-template profiling, but I wish to get one some day. Then we'd have more "war on ..." topics ;-D
Generally, templates are a major part of our parsing, and that's over 50% of our current cluster CPU load. As we actually managed to hit 100% last week, something that hasn't happened for a while, some work has to be done here.
Of course, new hardware will help for a while, but I for one get huge personal satisfaction from saving donation money. ;-)
CHEERS!
On Sat, Jan 31, 2009 at 2:03 PM, Domas Mituzas wrote:
Hello,
I understand the need for cite, thats why it is still there :) But... (...)
What about converting these to ref tags?
Unfortunately, {{cite}} is the only template I can profile/account for now, we don't have proper per-template profiling, but I wish to get one some day. Then we'd have more "war on ..." topics ;-D
Stub templates, for example :D
Generally, templates are major part of our parsing, and thats over 50% of our current cluster CPU load.
Wow. Can you compare the load to the systems with the load caused by solely using tags?
Marco
I understand the need for cite, thats why it is still there :) But... (...)
What about converting these to ref tags?
Unfortunately, most of those are designed to format the refs to a "proper" standard that we use (Harvard/MLA standard, iirc) and are designed to be easily updated when we change our standards (e.g., recently the "pages" value changed in one of the cite templates, and a bot went through and fixed them all).
Domas, have you performed any further analysis to figure out _how_ the template can be optimized? Would, say, reducing its size help, or would the complexity caused by doing so outweigh the advantage?
— Kalan
A long while ago I remember looking at the parser and realizing that the recursive template expansion and argument handling led the parser to run all branches of #if and #switch statements before deciding which one to include.
In other words, given {{#if: something | statements_A | statements_B }}, the parser was fully expanding both statements_A and statements_B before checking #if to decide which one to keep. Obviously that is inefficient and in the case of very complicated conditional templates potentially very expensive.
The parser has changed so much since I last worked with it that I am having difficulty figuring out if this is still true. Hopefully, someone already went through and improved the branch handling logic, but if not, I would suggest that this would also be a good generalized target for improving template operation.
-Robert Rohde
On Sat, Jan 31, 2009 at 5:03 AM, Domas Mituzas midom.lists@gmail.com wrote:
(...)
Domas Mituzas -- http://dammit.lt/ -- [[user:midom]]
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Would storing an intermediate template improve things? I mean, keep a template but where the inner templates are substed, depending on the original parameters.
Robert Rohde wrote:
A long while ago I remember looking at the parser and realizing that the recursive template expansion and argument handling led the parser to run all branches of #if and #switch statements before deciding which one to include.
In other words, given {{#if: something | statements_A | statements_B }}, the parser was fully expanding both statements_A and statements_B before checking #if to decide which one to keep. Obviously that is inefficient and in the case of very complicated conditional templates potentially very expensive.
The new preprocessor doesn't follow unused branches (or so we were told ;).
http://en.wikipedia.org/wiki/Template:Citation/core screams for having loops
Robert Rohde wrote:
A long while ago I remember looking at the parser and realizing that the recursive template expansion and argument handling led the parser to run all branches of #if and #switch statements before deciding which one to include.
In other words, given {{#if: something | statements_A | statements_B }}, the parser was fully expanding both statements_A and statements_B before checking #if to decide which one to keep. Obviously that is inefficient and in the case of very complicated conditional templates potentially very expensive.
The parser has changed so much since I last worked with it that I am having difficulty figuring out if this is still true. Hopefully, someone already went through and improved the branch handling logic, but if not, I would suggest that this would also be a good generalized target for improving template operation.
No, it's no longer true; yes, dead branches are now eliminated. This was done at a significant cost to code complexity, and there was quite a lot of overhead. The elimination of dead branches is the only reason the new parser has comparable performance to the old parser; otherwise it would have been slower.
-- Tim Starling
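To make the difference concrete, here is a minimal Python sketch of eager versus lazy handling of {{#if: something | statements_A | statements_B }}. It is an illustration only: `expand` and the call counter are hypothetical stand-ins for expensive template expansion, not MediaWiki's actual preprocessor.

```python
# Hypothetical stand-in for template expansion; counts how often it runs.
calls = {"count": 0}


def expand(text):
    """Pretend to expand wikitext; in a real parser this is the costly step."""
    calls["count"] += 1
    return text.upper()


def if_eager(condition, branch_a, branch_b):
    # Behaviour described for the old parser: expand both branches first,
    # then decide which result to keep.
    a = expand(branch_a)
    b = expand(branch_b)
    return a if condition.strip() else b


def if_lazy(condition, branch_a, branch_b):
    # Dead-branch elimination: pick the branch first, expand only that one.
    chosen = branch_a if condition.strip() else branch_b
    return expand(chosen)


def count_expansions(if_func):
    """Run one #if and report its result and how many expansions it cost."""
    calls["count"] = 0
    result = if_func("something", "statements_a", "statements_b")
    return result, calls["count"]


eager_result, eager_calls = count_expansions(if_eager)
lazy_result, lazy_calls = count_expansions(if_lazy)
print(eager_result, eager_calls)
print(lazy_result, lazy_calls)
```

Both variants return the same text; only the number of expansions differs, and the gap grows with nesting depth in complicated conditional templates.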
Domas Mituzas wrote:
So, a checklist what can be done ( simple to complex )
[ ] - Simplification of {{cite}}
Short of significant improvements to the parser or requiring people to ask Domas before editing the template, I can
[ ] - Separate cache for Cite, to avoid reparsing on minor edits, that don't involve citations. I have no idea how much this would win, but there is theoretical chance of stripping 1% or so. ;)
[ ] - Offload some templates like {{cite}} to actual PHP extensions (can of worms, but, oh well, can be standardized process too)
I've actually considered something like this in the past, basically creating a Cite 2.0 extension, where all the main cite options would be in the <ref> tags themselves with pre-defined "templates" written in PHP for web citations, book citations, etc.; this would greatly reduce the amount of stuff that needs to be done using the Cite wiki-templates and run through the parser.
You would have something like:
<ref author="Foo" title="Bar" type="book">Pages 1-10</ref>
Any parameters in the ref tag would be converted to HTML output using the "book" template in the extension rather than a thousand parser functions in some meta-template, and only the content of the tag (the page numbers in this case) would have to be run through the parser, so it would also be backwards-compatible with the current templates until they can all be migrated.
The main downside to this is that it requires someone to file a Bugzilla request every time a template needs changing.
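A rough sketch of that idea, written in Python rather than PHP purely for illustration: the attributes of the ref tag are handed to a predefined formatter chosen by `type`, and only the tag content would still go through the parser. The formatter names and the output style are assumptions invented for this example, not an existing Cite API or a real citation standard.

```python
from html import escape


def format_book(attrs, content):
    """Hypothetical 'book' formatter; the output layout is invented."""
    author = escape(attrs.get("author", ""))
    title = escape(attrs.get("title", ""))
    return f"{author}. <i>{title}</i>. {content}"


def format_web(attrs, content):
    """Hypothetical 'web' formatter; the output layout is invented."""
    title = escape(attrs.get("title", ""))
    url = escape(attrs.get("url", ""))
    return f'<a href="{url}">{title}</a>. {content}'


# Predefined formatters live in code instead of in wiki meta-templates.
FORMATTERS = {"book": format_book, "web": format_web}


def render_ref(attrs, parsed_content):
    """Render one <ref>; parsed_content is the already-parsed tag body."""
    formatter = FORMATTERS.get(attrs.get("type"))
    if formatter is None:
        # No known type: behave like a plain <ref>, so existing
        # template-based references keep working.
        return parsed_content
    return formatter(attrs, parsed_content)


book_ref = render_ref({"author": "Foo", "title": "Bar", "type": "book"}, "Pages 1-10")
plain_ref = render_ref({}, "Pages 1-10")
print(book_ref)
print(plain_ref)
```

The fallback branch is what would keep such a design backwards-compatible with references that still use the wiki templates.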
On Sat, Jan 31, 2009 at 1:28 PM, Alex mrzmanwiki@gmail.com wrote:
The main downside to this is that it requires someone to file a Bugzilla request every time a template needs changing.
What about throwing them in MediaWiki: space, similar to editnotices? At least then they could be cached to hell and back in the message cache.
-Chad
Chad wrote:
What about throwing them in MediaWiki: space, similar to editnotices? At least then they could be cached to hell and back in the message cache.
-Chad
I considered that as well, but I'm not sure how much that will actually help. Looking at http://en.wikipedia.org/wiki/Joe%20the%20Plumber?action=purge&forceprofi...
it took 21.796 seconds to load, most of which seems to be from Parser::recursiveTagParse, and about 90% of that is from Cite::referencesFormat-parse. Even if the templates themselves are heavily cached, it still has to run all the conditionals and formatting through the parser. Heavy caching might help if there are lots of refs with the same content on multiple pages, but I don't think that's very common.
On Sat, Jan 31, 2009 at 5:37 PM, Alex mrzmanwiki@gmail.com wrote:
I considered that as well, but I'm not sure how much that will actually help. Looking at
http://en.wikipedia.org/wiki/Joe%20the%20Plumber?action=purge&forceprofi...
it took 21.796 seconds to load, most of which seems to be from Parser::recursiveTagParse, and about 90% of that is from Cite::referencesFormat-parse. Even if the templates themselves are heavily cached, it still has to run all the conditionals and formatting through the parser. Heavy caching might help if there are lots of refs with the same content on multiple pages, but I don't think that's very common.
-- Alex (wikipedia:en:User:Mr.Z-man)
Throw a caching layer on top of it. Do a final expansion until final substitution at the {{cite book}} etc level. Then you've got less to recursively parse.
-Chad
On Sat, Jan 31, 2009 at 8:03 AM, Domas Mituzas midom.lists@gmail.com wrote:
[ ] - Implement proper scripting engine like Lua for metatemplates (http://pecl.php.net/package/lua
- another can of worms, though yet again, can be managed via trusted
set of people, on top20 wikis or so).
This seems like it's the only solution from your list that would be generally applicable to similar future scenarios. I don't think the users would have to be particularly trusted -- just make sure that the runtime of the programs is limited, and that it's properly sandboxed (is the Lua PECL extension sandboxed?).
Another thought that occurs to me is to cache the output of templates as a function of their parameters and any appropriate variables they use (like {{PAGENAME}} or {{CURRENTDAY}}). Then a reparse of a template-heavy page will generally only have to reparse templates if the parameters to the template have changed. This will save a lot on Cite, infoboxes, etc.
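A minimal sketch of such a cache, in Python for illustration only. The cache key covers the template name, its parameters, and just the variables the template actually uses; all function names here are hypothetical, and the sketch ignores invalidation when the template definition itself changes.

```python
import hashlib
import json

_cache = {}
render_count = {"count": 0}


def render_template(name, params, variables):
    """Hypothetical expensive render; stands in for real template expansion."""
    render_count["count"] += 1
    args = ", ".join(f"{k}={v}" for k, v in sorted(params.items()))
    return f"[{name}: {args} on {variables.get('PAGENAME', '')}]"


def cache_key(name, params, variables):
    # sort_keys makes the key independent of parameter order.
    payload = json.dumps([name, params, variables], sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def cached_render(name, params, used_variables, page_vars):
    """Render via the cache; only variables the template uses enter the key."""
    variables = {v: page_vars[v] for v in used_variables}
    key = cache_key(name, params, variables)
    if key not in _cache:
        _cache[key] = render_template(name, params, variables)
    return _cache[key]


page = {"PAGENAME": "Visna virus", "CURRENTDAY": "2"}
first = cached_render("cite book", {"title": "Bar", "author": "Foo"}, ["PAGENAME"], page)
# Same parameters in a different order: served from the cache.
second = cached_render("cite book", {"author": "Foo", "title": "Bar"}, ["PAGENAME"], page)
# A changed parameter forces a fresh render.
third = cached_render("cite book", {"author": "Foo", "title": "Baz"}, ["PAGENAME"], page)
print(render_count["count"])
```

Because {{CURRENTDAY}} is not listed as used by this template, a change of day would not invalidate its entries, while a template that does use it would get a new key each day.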
Aryeh Gregor wrote:
This seems like it's the only solution from your list that would be generally applicable to similar future scenarios. I don't think the users would have to be particularly trusted -- just make sure that the runtime of the programs is limited, and that it's properly sandboxed (is the Lua PECL extension sandboxed?).
That would be like adding a dependency on the Lua extension for reusers, as the core templates would be implemented in Lua. And I don't think it's worth reimplementing a Lua interpreter in PHP...
On Sat, Jan 31, 2009 at 8:19 PM, Platonides Platonides@gmail.com wrote:
That would be like adding a dependancy on Lua extension for reusers, as the core templates will be implemented in Lua.
Yes, that would be the major disadvantage I can see. In practice, nobody can reuse large chunks of Wikipedia content on shared hosting anyway, since it's way too big, but it would be a serious obstacle for people who want to reuse only parts of Wikipedia.
^_^ Wikipedia is already a horrible place to copy templates from. Unlike Wikipedia, most other MW installations don't bother turning on Tidy, and Wikipedia abuses that /feature/ way too much.
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://nadir-seen-fire.com] -Nadir-Point (http://nadir-point.com) -Wiki-Tools (http://wiki-tools.com) -MonkeyScript (http://monkeyscript.nadir-point.com) -Animepedia (http://anime.wikia.com) -Narutopedia (http://naruto.wikia.com) -Soul Eater Wiki (http://souleater.wikia.com)
Hoi,
Let us please appreciate what is being said here: "Wikipedia is a horrible place to copy templates from". We pride ourselves on being open source, and the current templates make us as bad as the worst proprietary vendor. We have what is effectively an API, and it is not documented at all.
Thanks,
GerardM
2009/2/1 Daniel Friesen dan_the_man@telus.net
^_^ Wikipedia is already a horrible place to copy templates from. Unlike Wikipedia, most other MW installations don't bother turning on Tidy, and Wikipedia abuses that /feature/ way too much.
On Sat, Jan 31, 2009 at 9:16 PM, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, Let us please appreciate what is being said here: "Wikipedia is a horrible place to copy templates from". We pride ourselves of being open source and the current templates make us as bad as the worst proprietary vendor. We have what is effectively an API and it is not documented at all. Thanks, GerardM
Actually, I think Daniel had a somewhat different point.
Wikimedia uses Tidy which does a good job at closing dangling format tags. A very substantial fraction of our templates actually have dangling divs, and tables, and other bad syntax that Tidy is covering up for us. Anyone who has ever tried to copy Wikimedia templates into a wiki with Tidy turned off (the default setting) knows that many of our templates will actually return a lot of junk.
Strictly speaking it should be the editors' job to properly close tables and divs, etc., but because Tidy is so good at it they don't have to, which makes our wikicode less portable.
-Robert Rohde
Robert Rohde wrote:
Strictly speaking it should be the editors' job to properly close tables and divs, etc., but because Tidy is so good at it they don't have to, which makes our wikicode less portable.
-Robert Rohde
The problem is not that Tidy is good but that it is silent. Thus editors don't even know that they should be fixing anything.
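As a rough illustration of what a non-silent check could look like, the Python sketch below counts opening and closing tags in a fragment of template output and reports the ones left dangling. It is an assumption-laden toy, not how Tidy or MediaWiki works: it only balances counts per tag name and treats a small, incomplete set of tags as void.

```python
from collections import Counter
from html.parser import HTMLParser

# Tags that never take a closing tag; this short list is an assumption.
VOID_TAGS = {"br", "hr", "img"}


class TagBalance(HTMLParser):
    """Tracks (opening tags - closing tags) per tag name."""

    def __init__(self):
        super().__init__()
        self.balance = Counter()

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.balance[tag] += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:
            self.balance[tag] -= 1


def unclosed_tags(markup):
    """Return {tag: count} for tags opened more often than they are closed."""
    parser = TagBalance()
    parser.feed(markup)
    parser.close()
    return {tag: n for tag, n in parser.balance.items() if n > 0}


dangling = unclosed_tags('<div class="box"><table><tr><td>text</td></tr></table>')
clean = unclosed_tags("<div><br>text</div>")
print(dangling)
print(clean)
```

A warning built on a check like this would tell editors that something needs fixing even while Tidy keeps the rendered page intact.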
On Sun, Feb 1, 2009 at 9:40 AM, Platonides Platonides@gmail.com wrote:
The problem is not that Tidy is good but that it is silent. Thus editors don't even know that they should be fixing anything.
Why do you think that they should, if Tidy does it anyway?
Domas Mituzas wrote:
- When parsing articles like one of most popular today,
[[en:Rod_Blagojevich_corruption_charges]], it takes 20s to produce the page, 17s is spent on Cite block, executing {{cite}} mostly. That makes every editor wait for ages to get a page displayed, and due to cache stampede after invalidation it causes considerable stress on site (look at numbers mentioned above).
Can you say how you measured this? What function you patched, what the code was, etc.?
-- Tim Starling
Can you say how you measured this? What function you patched, what the code was, etc.?
I was checking the 'Cite::referencesFormat-parse' profiling hook, as it is prominent in generic profiling, is not reentrant, and can be easily isolated in ?forceprofile=true outputs like here:
http://en.wikipedia.org/wiki/Rod_Blagojevich_corruption_charges?action=purge...
(you have to be logged in, view source).
BR,
On Sat, Jan 31, 2009 at 5:03 AM, Domas Mituzas midom.lists@gmail.com wrote:
[ ] - Separate cache for Cite, to avoid reparsing on minor edits, that don't involve citations. I have no idea how much this would win,
Domas and I worked on this on IRC for a bit just now, and the change has been synced to Wikimedia wikis. It generates the cache key from an md5 of the input to the parser and the page-id. Render hash could be included in this if it causes problems, but I'm not sure what stuff in references will depend on the render hash and it may be safe to keep it out of the cache key.
The change has been pretty handy, from what I can see. Reports I've received indicate that render time for [[en:Rod Blagojevich corruption charges]] dropped from 10.684 s to 2.700 s.
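The keying scheme described here can be sketched as follows, in Python for illustration only. The key is built from the page id and an md5 of the parser input; the `cite:` prefix, the function names, and the in-memory dict are assumptions made for this example, and the deployed change may differ.

```python
import hashlib

cache = {}
parse_calls = {"count": 0}


def cite_cache_key(page_id, parser_input):
    """Build a key from the page id and an md5 of the text given to the parser."""
    digest = hashlib.md5(parser_input.encode("utf-8")).hexdigest()
    return f"cite:{page_id}:{digest}"


def slow_parse(text):
    """Hypothetical stand-in for the expensive references parse."""
    parse_calls["count"] += 1
    return f"<li>{text}</li>"


def format_references(page_id, parser_input):
    key = cite_cache_key(page_id, parser_input)
    if key not in cache:
        cache[key] = slow_parse(parser_input)
    return cache[key]


refs = "{{cite book|author=Foo|title=Bar}}"
format_references(42, refs)          # first render: parsed
format_references(42, refs)          # edit that left the refs alone: cache hit
format_references(42, refs + " ")    # references changed: parsed again
format_references(43, refs)          # same text on another page: parsed again
print(parse_calls["count"])
```

Note that the key sees only the input text and the page id, so anything else the output depends on (the render hash mentioned above, for instance) is not reflected in it.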
In a few minutes the Cite cache dropped Cite out of our profiling top50 page, and even though it accounted for 10% of cluster load today, it is under 3% already. I claim 90% of the honor for this improvement; the rest goes to Andrew, who implemented it all ;-D
Do note, the war is not over yet, we will have to think how to resolve metatemplates properly :)
BR,
We could always go back to no templates at all ;-)
-Chad
You guys appear to have broken something.
Look at ref #13 at the bottom of [1]
Instead of saying [[Discover]] it magically says [[encephalitis]] for the magazine name.
-Robert Rohde
[1] http://en.wikipedia.org/w/index.php?title=Visna_virus&diff=268100404&...
2009/2/2 Andrew Garrett andrew@epstone.net:
Domas and I worked on this on IRC for a bit just now, and the change has been synced to Wikimedia wikis. It generates the cache key from an md5 of the input to the parser and the page-id. Render hash could be included in this if it causes problems, but I'm not sure what stuff in references will depend on the render hash and it may be safe to keep it out of the cache key.
Heh. How much of an evil hack is this?
- d.
Andrew Garrett wrote:
On Sat, Jan 31, 2009 at 5:03 AM, Domas Mituzas midom.lists@gmail.com wrote:
[ ] - Separate cache for Cite, to avoid reparsing on minor edits, that don't involve citations. I have no idea how much this would win,
Domas and I worked on this on IRC for a bit just now, and the change has been synced to Wikimedia wikis. It generates the cache key from an md5 of the input to the parser and the page-id. Render hash could be included in this if it causes problems, but I'm not sure what stuff in references will depend on the render hash and it may be safe to keep it out of the cache key.
The change has been pretty handy, from what I can see. Reports I've received indicate that render time for [[en:Rod Blagojevich corruption charges]] dropped from 10.684 s to 2.700 s.
What would <ref>{{ {{PAGENAME}}/Citations }}</ref> return?
-- Tim Starling