[Mediawiki-api] Limited category intersection

Jesse Martin (Pathoschild) pathoschild at gmail.com
Wed Jul 11 14:57:35 UTC 2007


Hello,

Limited category intersection as part of the API would be extremely
useful for wikis with overlapping categories. I discussed this on IRC
with Simetrical, and it seems that limiting intersection to categories
of 5000 or fewer members each should not use excessive resources for a
bot API (although benchmarks would be needed).

Intersecting very large categories with a PHP script took me less than
one second (plus a longer time to retrieve the category members before
intersecting). It should be faster in SQL, since a single query
retrieves and intersects simultaneously.
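
As a minimal sketch of the single-query idea, assuming a table modelled
on MediaWiki's categorylinks (cl_from = page ID, cl_to = category name):
the self-join below retrieves and intersects in one statement. Python
and an in-memory SQLite database stand in for PHP and the real database
here; the function names, the sample rows, and the way the 5000-member
cap is enforced are all hypothetical, and this has not been benchmarked.

```python
import sqlite3

MAX_MEMBERS = 5000  # per-category cap suggested in this message


def intersect_categories(conn, cat_a, cat_b, max_members=MAX_MEMBERS):
    """Return sorted page IDs found in both categories.

    Returns None when either category exceeds max_members, so that
    oversized categories are refused instead of being intersected.
    """
    for cat in (cat_a, cat_b):
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM categorylinks WHERE cl_to = ?", (cat,)
        ).fetchone()
        if count > max_members:
            return None

    # Self-join on the page ID: a page qualifies only if it has one row
    # for each of the two categories.
    rows = conn.execute(
        """
        SELECT a.cl_from
        FROM categorylinks AS a
        JOIN categorylinks AS b ON a.cl_from = b.cl_from
        WHERE a.cl_to = ? AND b.cl_to = ?
        ORDER BY a.cl_from
        """,
        (cat_a, cat_b),
    ).fetchall()
    return [page_id for (page_id,) in rows]


def demo_connection():
    """Build a tiny in-memory table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT)")
    conn.executemany(
        "INSERT INTO categorylinks VALUES (?, ?)",
        [
            (1, "Poems"), (2, "Poems"), (3, "Poems"),
            (2, "Ancient_works"), (3, "Ancient_works"), (4, "Ancient_works"),
        ],
    )
    return conn
```

With the sample rows, intersect_categories(demo_connection(), "Poems",
"Ancient_works") yields pages 2 and 3, the only ones in both categories.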

The SQL query itself should be relatively simple. The URL query might
look something like the following; changing the &return parameter would
also make union (a list of all pages in either category) possible.
http://en.wikisource.org/w/api.php?action=query&list=categorymembers&cmprop=title&cmlimit=5000&cmcategory=Poems|Ancient_works&return=intersection
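
To make the proposed parameters concrete, here is a hedged Python
sketch that builds the URL above and shows what the two &return modes
would compute. Both the pipe-separated cmcategory value and the return
parameter are only the proposal from this message, not existing API
features, and the helper names are hypothetical.

```python
from urllib.parse import urlencode

API_BASE = "http://en.wikisource.org/w/api.php"


def build_query_url(categories, mode="intersection", limit=5000):
    """Build the proposed query URL for the given category names."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmprop": "title",
        "cmlimit": str(limit),
        "cmcategory": "|".join(categories),  # proposed: several categories
        "return": mode,                      # proposed: intersection or union
    }
    # Keep "|" unescaped so the URL reads like the example in this message.
    return API_BASE + "?" + urlencode(params, safe="|")


def combine(member_lists, mode):
    """Combine per-category title lists according to the proposed mode."""
    sets = [set(members) for members in member_lists]
    if not sets:
        return set()
    if mode == "intersection":
        return set.intersection(*sets)  # pages in every category
    if mode == "union":
        return set.union(*sets)         # pages in at least one category
    raise ValueError("unknown return mode: " + mode)
```

Calling build_query_url(["Poems", "Ancient_works"]) reproduces the URL
above, and passing mode="union" changes only the final parameter.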

Does this seem feasible? I could try my hand at coding it if nobody
else will, but I have only a patchwork knowledge of PHP and SQL.

Yours cordially,
Jesse Martin (Pathoschild)


