Hi, I had a quick glance at it, and it seems the index is the wrong way around?
ci_hash should be first, as that's what you're searching by primarily.
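To illustrate the point, here is a toy sketch using SQLite from Python. Only the ci_hash column name comes from the extension; the table name category_hash, the ci_page column, and the index name are made up for the example, and MySQL's planner will report its plans differently.

```python
import sqlite3


def lookup_plan(index_columns: str) -> str:
    """Return SQLite's query-plan text for a lookup by ci_hash."""
    db = sqlite3.connect(":memory:")
    # Hypothetical schema: only the ci_hash column name is from the extension.
    db.execute("CREATE TABLE category_hash (ci_page INTEGER, ci_hash VARCHAR(32))")
    db.execute(f"CREATE INDEX ci_idx ON category_hash ({index_columns})")
    rows = db.execute(
        "EXPLAIN QUERY PLAN SELECT ci_page FROM category_hash WHERE ci_hash = ?",
        ("0" * 32,),
    ).fetchall()
    db.close()
    return rows[0][3]  # fourth column holds the human-readable plan step


print(lookup_plan("ci_page, ci_hash"))  # page first: the whole index is scanned
print(lookup_plan("ci_hash, ci_page"))  # hash first: the index is searched by ci_hash
```

With the hash as the leading column the lookup can seek straight to the matching entries; with the page first, every index entry has to be read.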
Jared
-----Original Message-----
From: wikitech-l-bounces@lists.wikimedia.org [mailto:wikitech-l-bounces@lists.wikimedia.org] On Behalf Of Magnus Manske
Sent: 06 March 2008 21:16
To: Wikimedia developers; Wikimedia Commons Discussion List
Subject: [Wikitech-l] Category intersection: New extension available
(cross-posting wikitech-l and commons-l)
I created a new extension ("CategoryIntersection") that allows for quick lookup of pages (and images) in intersecting categories. That would enable wiki(m|p)edia sites to use categories as tags, eliminating the need for oh-so-specialized categories.
Intersecting two categories is very fast; intersecting more categories is also possible and already implemented, and the maximum number can be limited.
I tried it on my (mostly empty) MediaWiki test setup, and it works peachy. However, *I NEED HELP* with
- testing it on a large-scale installation
- integrating it with MediaWiki more tightly (database wrappers, caching, etc.)
- Brionizing the code, so it actually has a chance to be used on Wikipedia and/or Commons
Technical notes:
- This was recently discussed on wikitech-l
- More than two intersections are implemented by nesting subqueries
- Hash values are implemented as VARCHAR(32). I could easily switch to INTEGER if desirable (less storage, faster lookup, but more false positives)
- The hash values only give good candidates (pages that *might* intersect in these categories). The candidates then have to be checked in a second run, which will have to be optimized; database people to the front!
- The table to store hash values has to be created manually; the SQL is in the main file
- I didn't implement code to fill the table for an existing installation; however, since hash table updates hang solely on the LinksUpdate hook, this should be easy
- There is no code covering page moves and deletions yet; do those hang on LinksUpdate as well?
- SQL queries are currently "plain text" and not constructed through the DB wrappers; I wasn't sure how to do that for the subqueries
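
To make the candidates-then-check idea above concrete, here is a minimal Python sketch. It assumes one MD5 hash (32 hex characters, matching VARCHAR(32)) per unordered pair of categories on a page. The function names, the "|" separator, and the in-memory dict standing in for the hash table are all illustrative and not taken from the extension's code, and plain set intersection stands in for the nested subqueries.

```python
import hashlib
from itertools import combinations


def pair_hash(cat_a: str, cat_b: str) -> str:
    """MD5 hex digest of a category pair, independent of argument order."""
    first, second = sorted((cat_a, cat_b))
    return hashlib.md5(f"{first}|{second}".encode("utf-8")).hexdigest()


def build_hash_table(page_categories):
    """Map each pair hash to the set of pages that produced it."""
    table = {}
    for page, cats in page_categories.items():
        for a, b in combinations(sorted(set(cats)), 2):
            table.setdefault(pair_hash(a, b), set()).add(page)
    return table


def intersect(page_categories, table, wanted):
    """Return the pages that are in every category listed in `wanted`."""
    wanted_set = set(wanted)
    if len(wanted_set) < 2:
        raise ValueError("need at least two distinct categories")
    # First run: hash lookups yield candidates that *might* match.
    candidates = None
    for a, b in combinations(sorted(wanted_set), 2):
        hits = table.get(pair_hash(a, b), set())
        candidates = hits if candidates is None else candidates & hits
    # Second run: check each candidate against its real category list,
    # which weeds out false positives caused by hash collisions.
    return sorted(p for p in candidates if wanted_set <= set(page_categories[p]))


pages = {
    "Mona Lisa": ["Paintings", "Louvre", "Portraits"],
    "Venus de Milo": ["Sculptures", "Louvre"],
    "The Scream": ["Paintings", "Portraits"],
}
table = build_hash_table(pages)
print(intersect(pages, table, ["Paintings", "Louvre"]))
print(intersect(pages, table, ["Paintings", "Portraits"]))
```

With full MD5 values the second run almost never rejects anything; it matters more if the hashes are shortened to integers, since collisions then become more likely.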
Cheers, Magnus
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l