Magnus - I checked out your tool, and it looks like you're querying the categorylinks table directly? Have you tried setting up a separate table for categories and fulltext indexing it? Use group_concat to get all of a page's categories into one field, then create a fulltext index on that field. You get much better performance than using the categorylinks table (kind of weird, eh?). Are you querying a live database, or a copy made from a dump? (Please excuse my ignorance if this is common knowledge.)
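To make the idea concrete, here is a rough sketch of the denormalize-then-index approach. It is only an illustration: it uses Python's sqlite3 module with an FTS5 virtual table as a stand-in for a MySQL fulltext index (so it assumes an SQLite build with FTS5 enabled), the table name categorysearch and the function names are made up for the example, and the sample category names are single lowercase words to sidestep tokenizer differences. The cl_from and cl_to columns are the page id and category title columns of categorylinks.

```python
import sqlite3


def build_category_search(conn):
    """Collapse categorylinks into one row per page and fulltext-index it."""
    # FTS5 stands in for a MySQL FULLTEXT index here (assumption for the sketch).
    conn.execute("CREATE VIRTUAL TABLE categorysearch USING fts5(categories)")
    # group_concat puts all of a page's categories into a single field.
    conn.execute(
        "INSERT INTO categorysearch (rowid, categories) "
        "SELECT cl_from, group_concat(cl_to, ' ') "
        "FROM categorylinks GROUP BY cl_from"
    )


def pages_in_all(conn, categories):
    """Return ids of pages that belong to every category in `categories`."""
    # Quoted terms separated by spaces are implicitly ANDed in FTS5.
    query = " ".join('"%s"' % c for c in categories)
    rows = conn.execute(
        "SELECT rowid FROM categorysearch "
        "WHERE categorysearch MATCH ? ORDER BY rowid",
        (query,),
    )
    return [row[0] for row in rows]


def demo_connection():
    """Build a tiny in-memory categorylinks table with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT)")
    conn.executemany(
        "INSERT INTO categorylinks VALUES (?, ?)",
        [
            (1, "physicists"), (1, "germans"),
            (2, "physicists"),
            (3, "germans"), (3, "physicists"), (3, "pianists"),
        ],
    )
    build_category_search(conn)
    return conn
```

With the sample data, pages_in_all(demo_connection(), ["physicists", "germans"]) finds pages 1 and 3. On MySQL the equivalent lookup would presumably be a boolean-mode MATCH ... AGAINST query with each category prefixed by "+".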
I'm working on dummying up a UI using the same approach (fulltext index of categories) on wikidweb and will write back when I've got something worth looking at...
Best Regards, Aerik
On Fri, 05 Dec 2008 13:39:15 -0800, Brion Vibber brion@wikimedia.org wrote:
Aryeh Gregor wrote:
No, someone should *write* a UI. It should be written and added to the software. If it's subpar, fine, it can be improved later. Better that a mediocre UI should be written and committed now than that yet another category intersection discussion should die away as they always do.
I'm Brion Vibber, and I approve this message.
(Note that we can be open to alternative, more efficient backends such as the Postgres system Greg's experimented with, or a Lucene backend, or whatever, but to be something people can actively develop and test with we need to at least have _something_ that works on MySQL, in the core software, available by default.)
Okay, that's a green light if I ever saw one, awesome. So let's create a "categorysearch" MyISAM table, stick all the categories in it, set up hooks to maintain it, and implement the "fulltext index solution". We'll use a special page to show the results (?). I'm working on an interface that would primarily depend on two links at the bottom of each article, "find similar articles" and "find related categories" - these bring up articles having the same categories, and a list of top categories belonging to those categories, respectively.
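As a rough illustration of what the two links could compute, here is a small in-memory sketch. It is one possible reading of the proposal, not the planned implementation: "similar" is taken to mean "shares at least one category, ranked by number shared", "related categories" is taken to mean "the most common other categories among those similar articles", and the function names and the page-to-categories dict are hypothetical.

```python
from collections import Counter


def find_similar_articles(page_categories, page, limit=10):
    """Rank other pages by how many categories they share with `page`."""
    own = page_categories[page]
    shared = Counter()
    for other, cats in page_categories.items():
        overlap = len(own & cats)
        if other != page and overlap:
            shared[other] = overlap
    # Most shared categories first; ties broken alphabetically.
    ranked = sorted(shared.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:limit]]


def find_related_categories(page_categories, page, limit=10):
    """List the most common categories among similar pages, excluding the page's own."""
    own = page_categories[page]
    counts = Counter()
    similar = find_similar_articles(page_categories, page, limit=len(page_categories))
    for other in similar:
        counts.update(page_categories[other] - own)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [category for category, _ in ranked[:limit]]


# Made-up sample data: page title -> set of category names.
SAMPLE_PAGES = {
    "Einstein": {"physicists", "germans"},
    "Planck": {"physicists", "germans", "nobel"},
    "Bohr": {"physicists", "danes", "nobel"},
    "Bach": {"germans", "composers"},
    "Chopin": {"composers", "poles"},
}
```

In the real thing the dict would be replaced by queries against the categorysearch table; the sketch only pins down what each link is expected to return.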
Sound good?
Aerik
On Sat, Dec 6, 2008 at 4:49 PM, aerik@thesylvans.com wrote:
Okay, that's a green light if I ever saw one, awesome. So let's create a "categorysearch" MyISAM table, stick all the categories in it, set up hooks to maintain it, and implement the "fulltext index solution". We'll use a special page to show the results (?). I'm working on an interface that would primarily depend on two links at the bottom of each article, "find similar articles" and "find related categories" - these bring up articles having the same categories, and a list of top categories belonging to those categories, respectively.
Sound good?
Sounds vastly better than the absolutely nothing we presently have.
wikitech-l@lists.wikimedia.org