Thanks, both of you, for your in-depth answers. It helped a lot :)
I have started a branch (category-redirects) to work on this. So far, category redirects are working, and I've created a new Job subclass to handle categorylinks big changes after a category move. Category moves are apparently working, as far as the target category (not category page) does not exist.
I am, however, considering to disallow category moves over an existing category (again, the *category*, not its attached page) : * When the user does this, he probably expects the categories to be merged together. * However as said before, the move has to be reversible: the proper way to do this is to redirect the former category to the latter; reversing this is only deleting a redirect. * But actually, after that redirect, there will be one page for two category objects. If we merged A and B to B, category A will redirect to category B, while the only valid Page will be B. That might not be a big problem, but it differs from the structure I currently use : 1 Category object <-> 1 Page object * It gets worse if you consider a third Category C, that was redirecting to A before the move : The user moves/merges A and B, so he expects C to redirect to the merged categories. However, after the move, we'll only have a redirect chain : C->A->B. And like double redirects for pages, it wont work. I don't see a way to resolve this use case in a reversible manner with my new schema.
Overriding move would be forbidden, and the user would have to create a redirect. Do you see a case where the user would suffer from creating a redirect instead of doing a hard, plain move ?
There is a little problem with that constraint, though : if a category once contained pages, it will still exist. It means that the user can be asked not to move a category to an... empty category. The easy answer here is "just check for category membership, and allow the move when no categorylinks entries point to the target category"; however, how will we deal with CategoryLinksUpdate Jobs that might be pending ? In other words, the target category might appear empty at the query time, when a move / redirect change is undergo. Should I try to address that ? If on moves, I make the job_title the target category title, searching the job table for job_cmd=categoryLinksUpdate AND job_title=target could help, but how does it sound ?
2008/7/1 Simetrical Simetrical+wikilist@gmail.com:
On Tue, Jul 1, 2008 at 12:02 PM, Roan Kattouw roan.kattouw@home.nl wrote:
I had the idea joining on integers would be faster, I guess I underestimated how much. I didn't know about the primary key thing, but it makes sense (it's called *primary* key for a reason).
The primary key benefits are specific to InnoDB, since it clusters the table data in the primary key (basically the table data is in the leaves of the primary key B-tree, if I understand right). A primary key lookup is therefore one B-tree lookup instead of two. In MyISAM, and in many other DBMSes, the primary key isn't special.
With the schema I was backing, yes. However, schema #1 doesn't eliminate the need to check for a category's redirect target when adding a page to it, and since joining on cat_page=page_id is faster than joining on cat_title=page_title (because of the int vs. varchar and primary vs. non-primary issues), that would constitute a "specific performance benefit" for adding cat_page, wouldn't it?
Not a big enough one. Adding an extra column means every row is that much larger, reducing key buffer efficiency and thereby hurting performance slightly for all queries on the table. And if you're going to join on it you also may need an extra index (depending on join direction), which takes time to maintain on every insert and delete and also competes for the cache. Plus you get the headache of denormalization.
In this case it's almost certainly not worth it. In the case of cl_final, where you're avoiding cripplingly large filesorts, it's definitely worth it, because otherwise the feature is completely untenable.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l