On 9/14/14, Gergo Tisza gtisza@wikimedia.org wrote:
Hi,
I would like to flag a large number of wiki pages based on whether their HTML passes a certain test, so that failing pages can be easily listed and counted. The flags should adapt when pages are created or modified. (The specific use case is collecting file pages which do not have machine-readable author and license information embedded.)
I have been thinking of adding such pages to a maintenance category from a parser hook (the test logic is already part of the imageinfo/extmetadata API and would be easy to reuse), is that a good way to do this? If so, what's the best way to achieve it? Is it OK to just add categories as needed via $parser->getOutput()->addCategory() or can that mess up internal state such as the categorylinks table?
Alternatively, the Cite extension just parses and appends a message to the end of the text on ParserBeforeTidy when it encounters an error, and the message contains wikitext to include a category. That seems like a clever way of maintaining flexibility so it is easy to change the category name or add extra text for a call to action without any need for a code change. Is that approach safe/cheap?
thanks Gergő
There are two ways this is usually done: either the page_props table or tracking categories. Provided the hook you use runs before LinksUpdate (which is true of any hook in the parser), you should be fine adding such things.
To add a page property, you would do something like $parser->getOutput()->setProperty( 'prop name', 'optionally some extra arbitrary data' );
Pages can be found via Special:PagesWithProp or direct db query.
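A rough sketch of how that might look from a parser hook, in case it helps. The class name, the property name and the failsTest() check are all made up for illustration; the real check would reuse your extmetadata logic. I'm assuming ParserAfterParse here, but any parser hook should do:

```php
<?php
// Register the handler (e.g. in the extension's setup file).
$wgHooks['ParserAfterParse'][] = 'ExamplePropHooks::onParserAfterParse';

class ExamplePropHooks {
	/**
	 * Flag the page with a page property when it fails the test.
	 * $text is the parsed HTML at this point.
	 */
	public static function onParserAfterParse( $parser, &$text, $stripState ) {
		if ( self::failsTest( $text ) ) {
			// 'example-missing-metadata' is a hypothetical property name.
			$parser->getOutput()->setProperty( 'example-missing-metadata', '' );
		}
		return true;
	}

	/**
	 * Placeholder check: looks for a made-up marker string in the HTML.
	 */
	private static function failsTest( $text ) {
		return strpos( $text, 'example-license-marker' ) === false;
	}
}
```

Since the property is set again on every parse, the flag goes away on its own once the page is fixed and reparsed.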
To add a tracking category: $parser->addTrackingCategory( 'tracking cat name' );
You also have to define a message for the tracking category name and a description message, and add the message key to $wgTrackingCategories. See the code docs for Parser::addTrackingCategory and $wgTrackingCategories.
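Put together, the tracking category version might look roughly like this. Again only a sketch: the message key, the class name and the failsTest() check are invented for the example, and the two messages themselves still have to be defined in the extension's i18n file:

```php
<?php
// Hypothetical message key; its message text is the category name.
$wgTrackingCategories[] = 'example-missing-metadata-category';
$wgHooks['ParserAfterParse'][] = 'ExampleTrackingHooks::onParserAfterParse';

class ExampleTrackingHooks {
	/**
	 * Add the page to the tracking category when it fails the test.
	 */
	public static function onParserAfterParse( $parser, &$text, $stripState ) {
		if ( self::failsTest( $text ) ) {
			// Pass the message key, not the category name itself.
			$parser->addTrackingCategory( 'example-missing-metadata-category' );
		}
		return true;
	}

	/**
	 * Placeholder check: looks for a made-up marker string in the HTML.
	 */
	private static function failsTest( $text ) {
		return strpos( $text, 'example-license-marker' ) === false;
	}
}
```

Because the category name comes from a message, it can be renamed on-wiki without a code change.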
Generally page props are used for obscure things that a user is unlikely to care about, or for cases where you need special cache invalidation behaviour on change (there's special support for that with $wgPagePropLinkInvalidations), whereas tracking categories are more for properties the user is interested in. It's also possible to make the tracking category off by default, until users turn it "on" by editing a page in the MediaWiki namespace, by making the category name default to '-'.
For the use case you describe, I think a tracking category is better suited.
--bawolff