Hi,
I'm looking for a hook that is called when the text of an article is parsed because of a cache miss.
On IRC I was told to use one of the hooks called by Parser::parse, but I am not sure that's the solution:
* Parse is also called on cached pages, e.g. to render the 'This page has been accessed xxx times.' message.
* There seems to be no elegant way to recognize if the text to parse is the actual Article content.
Currently I hook into ParserBeforeTidy, use a static variable to ensure the method is executed only once and create a Revision object to get at the article text. The method is always run regardless of whether there was a cache miss or not.
There should be an easier and more elegant way to do this, right?
Cheers, Stephan
On Wed, Nov 9, 2011 at 2:35 PM, Stephan Gambke s7eph4n@gmail.com wrote:
Hi,
I'm looking for a hook that is called when the text of an article is parsed because of a cache miss.
What are you trying to accomplish with a hook that's called during (before? after?) parsing that comes after a parser cache miss specifically?
Keep in mind that the same article may be parsed with different configurations (thus separate cache keys) or may not get cached at all. It also gets pre-rendered at default settings before we even check the cache during saving -- would this count as a cache miss?
There may be a better way to do what you're actually trying to do.
* There seems to be no elegant way to recognize if the text to parse is the actual Article content.
You generally should not make such assumptions as, indeed, many bajillions of things may get parsed, plenty of which are not standalone articles or pages.
-- brion
Hi,
On 10 November 2011 00:32, Brion Vibber brion@wikimedia.org wrote:
I'm looking for a hook that is called when the text of an article is parsed because of a cache miss.
What are you trying to accomplish with a hook that's called during (before? after?) parsing that comes after a parser cache miss specifically?
Semantic Forms form definition pages work to a degree similar to templates, in that they have some explanatory text in <noinclude> tags and the actual form definition in <includeonly> tags. Currently this form definition is parsed every time a form is requested. I would like to cache it. As I understood it, I can do that by setting a property on the form definition page's ParserOutput object. So my idea was, whenever there is a cache miss on the form definition page I parse the part in <includeonly> tags and cache it along with the page. Then, when a form is actually to be displayed I get the form definition text from cache if available.
Keep in mind that the same article may be parsed with different configurations (thus separate cache keys) or may not get cached at all. It also gets pre-rendered at default settings before we even check the cache during saving -- would this count as a cache miss?
I don't think so. All I wanted was an "Ok, this is outdated, we have to reparse", but I guess it is not that easy. :) Is there somewhere (apart from the code) where I can read up on how exactly caching works in MW?
* There seems to be no elegant way to recognize if the text to parse is the actual Article content.
You generally should not make such assumptions as, indeed, many bajillions of things may get parsed, plenty of which are not standalone articles or pages.
I noticed. :)
Cheers, Stephan
On Thu, Nov 10, 2011 at 1:04 AM, Stephan Gambke s7eph4n@gmail.com wrote:
Semantic Forms form definition pages work to a degree similar to templates, in that they have some explanatory text in <noinclude> tags and the actual form definition in <includeonly> tags. Currently this form definition is parsed every time a form is requested. I would like to cache it. As I understood it, I can do that by setting a property on the form definition page's ParserOutput object. So my idea was, whenever there is a cache miss on the form definition page I parse the part in <includeonly> tags and cache it along with the page. Then, when a form is actually to be displayed I get the form definition text from cache if available.
Since the <noinclude> contents are excluded in an early stage of parsing, it probably doesn't make sense to hook into the parser or parser cache here.
A more typical caching pattern within MediaWiki would look something like this:
* devise an appropriate cache key involving the form's id or title, eg wfCacheKey( 'formdata', $page->articleId() );
* at times when you would fetch the form definition data, first pull that key from cache
** if cache hit, use that data
** if cache miss, fall through to existing article fetch & form definition parsing
*** after generating that data, save it to cache
* when form pages are saved anew, delete the cache entry so it can be regenerated with fresh data
You can grab caches from wfGetMainCache() and friends (by default the main cache is a null-op, whereas wfGetParserCacheStorage() will go to the objectcache table if something like memcache isn't being used, so will always actually store stuff).
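
The steps above can be sketched roughly as follows. This is only a minimal, self-contained illustration, not Semantic Forms code: ArrayCache is a hypothetical in-memory stand-in for whatever cache object MediaWiki hands back, and formDataKey(), getFormDefinition() and onFormPageSaved() are made-up names. It assumes the cache returns false on a miss, and that the real key would come from a helper like the wfCacheKey() call mentioned above.

```php
<?php
// Hypothetical in-memory stand-in for a MediaWiki cache object.
class ArrayCache {
    private $data = array();

    public function get( $key ) {
        // Assumption: a miss is signalled by returning false.
        return array_key_exists( $key, $this->data ) ? $this->data[$key] : false;
    }

    public function set( $key, $value ) {
        $this->data[$key] = $value;
        return true;
    }

    public function delete( $key ) {
        unset( $this->data[$key] );
        return true;
    }
}

// Hypothetical key builder; inside MediaWiki a key helper would play this role.
function formDataKey( $articleId ) {
    return 'formdata:' . $articleId;
}

// Fetch the form definition, consulting the cache first.
// $parseDefinition is a callable standing in for the existing
// article fetch and form definition parsing.
function getFormDefinition( $cache, $articleId, $parseDefinition ) {
    $key = formDataKey( $articleId );
    $data = $cache->get( $key );
    if ( $data !== false ) {
        return $data; // cache hit: use the stored data
    }
    // cache miss: fall through to parsing, then store the result
    $data = call_user_func( $parseDefinition, $articleId );
    $cache->set( $key, $data );
    return $data;
}

// To be called when a form page is saved, so the entry is regenerated
// with fresh data on the next fetch.
function onFormPageSaved( $cache, $articleId ) {
    $cache->delete( formDataKey( $articleId ) );
}

// Usage: count how often the expensive parse actually runs.
$cache = new ArrayCache();
$parseCount = 0;
$parse = function ( $id ) use ( &$parseCount ) {
    $parseCount++;
    return "definition for page $id";
};

$first = getFormDefinition( $cache, 42, $parse );  // miss, parses
$second = getFormDefinition( $cache, 42, $parse ); // hit, no parse
onFormPageSaved( $cache, 42 );                      // invalidate on save
$third = getFormDefinition( $cache, 42, $parse );  // miss again, parses
echo $parseCount, "\n"; // prints 2
```

In a real extension the delete step would hang off whichever save hook is appropriate, and the cache object would come from wfGetMainCache() or wfGetParserCacheStorage() as described above.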
-- brion
On Thu, Nov 10, 2011 at 9:13 AM, Brion Vibber brion@pobox.com wrote:
part in <includeonly> tags and cache it along with the page. Then,
when a form is actually to be displayed I get the form definition text from cache if available.
Since the <noinclude> contents are excluded in an early stage of parsing, it probably doesn't make sense to hook into the parser or parser cache here.
^ i mean of course <includeonly>. Blah! :)
-- brion
That's what you get when you are too fixated on one solution. I never even thought of separating caching the page from caching the definition.
Thanks Brion!
Am 10.11.2011 18:13, schrieb Brion Vibber:
A more typical caching pattern within MediaWiki would look something like this:
* devise an appropriate cache key involving the form's id or title, eg wfCacheKey( 'formdata', $page->articleId() );
* at times when you would fetch the form definition data, first pull that key from cache
** if cache hit, use that data
** if cache miss, fall through to existing article fetch & form definition parsing
*** after generating that data, save it to cache
* when form pages are saved anew, delete the cache entry so it can be regenerated with fresh data
You can grab caches from wfGetMainCache() and friends (by default the main cache is a null-op, whereas wfGetParserCacheStorage() will go to the objectcache table if something like memcache isn't being used, so will always actually store stuff).
mediawiki-l@lists.wikimedia.org